ChatGPT vs Claude vs Grok: Python Data Analysis

Prompt

Python for data analysis - pandas, numpy, matplotlib

ChatGPT (OpenAI)

Certainly! Here's a brief overview and example usage of pandas, numpy, and matplotlib in Python for data analysis:

1. NumPy

Purpose: Fundamental package for numerical computing in Python.

Use: Provides support for arrays, mathematical functions, linear algebra, random number generation, etc.

Example:

import numpy as np
# Create a numpy array
arr = np.array([1, 2, 3, 4, 5])
# Basic operations
mean = np.mean(arr)
std_dev = np.std(arr)
print("Mean:", mean)
print("Standard Deviation:", std_dev)
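The "Use" line above also mentions linear algebra and random number generation, which the example does not exercise. A minimal sketch of both follows; the matrix `A`, vector `b`, and the seed value are illustrative assumptions, not part of the original response.

```python
import numpy as np

# Hypothetical 2x2 system A @ x = b, chosen only for illustration
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Solve the linear system without forming the inverse explicitly
x = np.linalg.solve(A, b)
print("Solution:", x)

# Seeded Generator for reproducible random draws (seed is an arbitrary choice)
rng = np.random.default_rng(seed=42)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)
print("Sample mean:", samples.mean())
```

`np.random.default_rng` is the Generator-based interface that NumPy recommends for new code; the legacy `np.random.*` functions used elsewhere on this page still work.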

2. pandas

Purpose: Library providing data structures and data analysis tools.

Use: Handling and manipulating tabular data (similar to spreadsheets or SQL tables).

Example:

import pandas as pd
# Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [70000, 80000, 120000, 95000]
}
df = pd.DataFrame(data)
# Basic data exploration
print(df.head())
print(df.describe())
# Filter data
high_salary = df[df['Salary'] > 90000]
print(high_salary)

3. Matplotlib

Purpose: Plotting library for creating static, animated, and interactive visualizations.

Use: Visualizing data distributions, trends, comparisons, etc.

Example:

import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 15, 7, 12, 9]
# Line plot
plt.plot(x, y, label='Sample Data')
# Add labels and title
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.title('Simple Line Plot')
plt.legend()
# Show plot
plt.show()

Combined Example: Using all three together

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Generate random data using numpy
np.random.seed(0)
ages = np.random.randint(20, 60, size=100)
salaries = np.random.normal(50000, 15000, size=100)
# Create a DataFrame
df = pd.DataFrame({'Age': ages, 'Salary': salaries})
# Analyze data
print(df.describe())
# Scatter plot to visualize relationship between Age and Salary
plt.scatter(df['Age'], df['Salary'], alpha=0.6)
plt.title('Age vs Salary')
plt.xlabel('Age')
plt.ylabel('Salary')
plt.show()
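A scatter plot shows the relationship visually; a correlation coefficient puts a number on it. The sketch below rebuilds the same kind of random data and computes the Pearson correlation. Because the ages and salaries here are drawn independently, the value should land close to zero. The variable name `r` is an illustrative choice.

```python
import numpy as np
import pandas as pd

# Same style of synthetic data as the combined example (independent draws)
np.random.seed(0)
ages = np.random.randint(20, 60, size=100)
salaries = np.random.normal(50000, 15000, size=100)
df = pd.DataFrame({'Age': ages, 'Salary': salaries})

# Pearson correlation between the two columns (pandas default method)
r = df['Age'].corr(df['Salary'])
print(f"Correlation between Age and Salary: {r:.3f}")

# NumPy gives the same number via the correlation matrix
r_np = np.corrcoef(df['Age'], df['Salary'])[0, 1]
print(f"Same value from np.corrcoef: {r_np:.3f}")
```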

If you want, I can provide more detailed examples, tutorials, or explain any specific functions!

Claude (Anthropic)

Python for Data Analysis: pandas, numpy, matplotlib

I'll walk you through the essential libraries for data analysis in Python with practical examples.

1. NumPy - Numerical Computing Foundation

NumPy provides the foundation for numerical computing in Python.

import numpy as np
# Creating arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
# Array operations
print(f"Array shape: {arr2.shape}")
print(f"Array dtype: {arr1.dtype}")
# Mathematical operations
numbers = np.array([1, 2, 3, 4, 5])
print(f"Mean: {np.mean(numbers)}")
print(f"Standard deviation: {np.std(numbers)}")
print(f"Sum: {np.sum(numbers)}")
# Broadcasting
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
result = matrix + 10  # Adds 10 to each element
print("Matrix + 10:")
print(result)
# Random number generation
random_data = np.random.normal(0, 1, 1000)  # Normal distribution
print(f"Random data mean: {np.mean(random_data):.2f}")
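The broadcasting step above adds a scalar to every element. Broadcasting also applies between arrays of different shapes, which is where it tends to matter in practice. A minimal sketch, reusing the same 3x3 matrix; the names `col_means`, `centered`, `row_sums`, and `normalized` are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Column means have shape (3,); subtracting broadcasts them across every row
col_means = matrix.mean(axis=0)
centered = matrix - col_means
print("Column-centered matrix:")
print(centered)

# Row sums kept as shape (3, 1) so the division broadcasts across columns
row_sums = matrix.sum(axis=1, keepdims=True)
normalized = matrix / row_sums
print("Row-normalized matrix (each row sums to 1):")
print(normalized)
```

The `keepdims=True` argument is what makes the second case work: without it the row sums would have shape (3,) and would be matched against columns instead of rows.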

2. Pandas - Data Manipulation and Analysis

Pandas is built on NumPy and provides high-level data structures.

import pandas as pd
import numpy as np
# Creating DataFrames
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['New York', 'London', 'Tokyo', 'Paris', 'Sydney'],
    'Salary': [50000, 60000, 70000, 55000, 65000]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
print(f"\nDataFrame info:")
df.info()  # info() prints its summary directly and returns None
# Basic operations
print(f"\nFirst 3 rows:")
print(df.head(3))
print(f"\nBasic statistics:")
print(df.describe())
# Filtering data
high_earners = df[df['Salary'] > 55000]
print(f"\nHigh earners:")
print(high_earners)
# Grouping and aggregation
city_stats = df.groupby('City')['Salary'].agg(['mean', 'count'])
print(f"\nSalary stats by city:")
print(city_stats)
# Adding new columns
df['Salary_Category'] = df['Salary'].apply(
    lambda x: 'High' if x > 60000 else 'Medium' if x > 50000 else 'Low'
)
print(f"\nDataFrame with salary category:")
print(df)

3. Reading and Writing Data

```python
import numpy as np
import pandas as pd

# Create sample data and save to CSV
sample_data = pd.DataFrame({
    'Date': pd.date_range('2024-01-01', periods=100),
    'Sales': np.random.randint(100, 1000, 100),
    'Product': np.random.choice(['A', 'B', 'C'], 100),
    'Region': np.random.choice(['North', 'South', 'East', 'West'], 100)
})

# Save to CSV
sample_data.to_csv('sales_data.csv', index=False)

# Read from CSV
df_sales = pd.read_csv('sales_data.csv')
df_sales['Date'] = pd.to_datetime(df_sales['Date'])  # Convert to datetime

print("Sales data sample:")
print(df_sales.head())

# Data analysis
monthly_sales = df_sales.groupby(df_sales['Date'].dt.month)['Sales'].sum()
product_performance = df_sales.groupby('Product')['Sales'].agg(['sum', 'mean', 'count'])

print("\nMonthly sales:")
```

Grok (xAI)

Python is one of the most popular programming languages for data analysis due to its powerful libraries like Pandas, NumPy, and Matplotlib. Below, I’ll provide an overview of each library, along with explanations and examples to help you get started with data analysis in Python.

1. NumPy (Numerical Python)

NumPy is the foundational library for numerical computations in Python. It provides support for arrays, matrices, and high-performance mathematical functions.

Key Features:

Efficient multi-dimensional arrays (ndarray).

Broadcasting for element-wise operations.

Mathematical functions (e.g., trigonometry, statistics, linear algebra).

Random number generation.

Installation:

If not already installed, you can install NumPy via pip:

pip install numpy

Example: Basic Operations with NumPy

import numpy as np
# Create a 1D array
arr = np.array([1, 2, 3, 4, 5])
print("Array:", arr)
# Perform element-wise operations
arr_squared = arr ** 2
print("Squared Array:", arr_squared)
# Create a 2D array (matrix)
matrix = np.array([[1, 2], [3, 4]])
print("Matrix:\n", matrix)
# Matrix multiplication
result = np.dot(matrix, matrix)
print("Matrix Multiplication:\n", result)
# Statistical operations
mean_val = np.mean(arr)
print("Mean of Array:", mean_val)

Output:

Array: [1 2 3 4 5]
Squared Array: [ 1  4  9 16 25]
Matrix:
 [[1 2]
 [3 4]]
Matrix Multiplication:
 [[ 7 10]
 [15 22]]
Mean of Array: 3.0

2. Pandas (Python Data Analysis Library)

Pandas is built on top of NumPy and is used for data manipulation and analysis. It provides data structures like Series (1D) and DataFrame (2D) to handle tabular data efficiently.

Key Features:

DataFrame for handling structured data (like spreadsheets or SQL tables).

Tools for reading/writing data (CSV, Excel, JSON, SQL, etc.).

Data cleaning, filtering, merging, and grouping.

Handling missing data.

Installation:

Install Pandas via pip:

pip install pandas

Example: Working with a DataFrame

import pandas as pd
# Create a dictionary to form a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 75000]
}
# Convert to DataFrame
df = pd.DataFrame(data)
print("DataFrame:\n", df)
# Basic operations
print("\nAverage Salary:", df['Salary'].mean())
# Filter data
filtered_df = df[df['Age'] > 25]
print("\nFiltered DataFrame (Age > 25):\n", filtered_df)
# Add a new column
df['Bonus'] = df['Salary'] * 0.1
print("\nDataFrame with Bonus Column:\n", df)

Output:

DataFrame:
       Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   75000

Average Salary: 61666.666666666664

Filtered DataFrame (Age > 25):
       Name  Age  Salary
1      Bob   30   60000
2  Charlie   35   75000

DataFrame with Bonus Column:
       Name  Age  Salary   Bonus
0    Alice   25   50000  5000.0
1      Bob   30   60000  6000.0
2  Charlie   35   75000  7500.0
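The key features listed for Pandas include handling missing data and merging, neither of which appears in the example above. A minimal sketch of both follows. The missing `Age` value, the `departments` table, and the choice to fill with the median are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical employee table with one missing Age value
employees = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, np.nan, 35],
    'Salary': [50000, 60000, 75000]
})

# Count missing values per column
missing = employees.isna().sum()
print("Missing values per column:\n", missing)

# Fill the gap with the column median (one of several reasonable strategies)
employees['Age'] = employees['Age'].fillna(employees['Age'].median())

# Hypothetical lookup table; Charlie is deliberately absent
departments = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Dept': ['Engineering', 'Sales']
})

# Left merge keeps every employee; unmatched rows get NaN in Dept
merged = employees.merge(departments, on='Name', how='left')
print("\nMerged DataFrame:\n", merged)
```

A left merge is used here so that no employee row is dropped; `how='inner'` would keep only the names present in both tables.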

3. Matplotlib (matplotlib.pyplot)

Matplotlib is a plotting library for creating visualizations in Python. The pyplot module provides a simple interface for generating plots like line charts, bar charts, histograms, etc.

Key Features:

Customizable 2D and 3D plots.

Integration with Pandas and NumPy for data visualization.

Wide variety of plot types (scatter, bar, pie, etc.).

Installation:

Install Matplotlib via pip:

pip install matplotlib

Example: Plotting Data

import matplotlib.pyplot as plt
import numpy as np
# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
# Create a line plot
plt.plot(x, y, label='Linear Trend', color='blue', marker='o')
plt.title('Simple Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.grid(True)
plt.show()
# Create a bar plot using Pandas DataFrame
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Salary': [50000, 60000, 75000]}
df = pd.DataFrame(data)
plt.bar(df['Name'], df['Salary'], color='green')
plt.title('Salary Bar Chart')
plt.xlabel('Name')
plt.ylabel('Salary')
plt.show()

Output:

The first code snippet generates a line plot with a linear trend.

The second snippet creates a bar chart showing salaries for different individuals.
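The examples above use the pyplot state-based interface and cover only line and bar charts. As a rough sketch of two other plot types from the feature list, the block below draws a histogram and a scatter plot side by side with Matplotlib's object-oriented interface. The synthetic data, the `Agg` backend (chosen so the code runs without a display), and the in-memory PNG buffer are assumptions for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: y depends linearly on x plus noise
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2 * x + rng.normal(size=200)

# One figure with two axes, each addressed explicitly
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

ax_hist.hist(x, bins=20, color='steelblue')
ax_hist.set_title('Histogram of x')
ax_hist.set_xlabel('x')
ax_hist.set_ylabel('Count')

ax_scatter.scatter(x, y, alpha=0.5, color='green')
ax_scatter.set_title('Scatter of y vs x')
ax_scatter.set_xlabel('x')
ax_scatter.set_ylabel('y')

fig.tight_layout()

# Render to an in-memory PNG instead of opening a window
buf = io.BytesIO()
fig.savefig(buf, format='png')
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` would display the figure instead.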

Combining Pandas, NumPy, and Matplotlib for Data Analysis

Here’s an end-to-end example where we analyze a small dataset and visualize the results.

Example: Sales Data Analysis

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Create a sample dataset
data = {
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Sales': [200, 250, 300, 280, 320],
    'Expenses': [180, 200, 220, 210, 240]
}
df = pd.DataFrame(data)
# Calculate Profit
df['Profit'] = df['Sales'] - df['Expenses']
print("DataFrame:\n", df)
# Summary statistics using NumPy
mean_sales = np.mean(df['Sales'])
print("\nAverage Sales:", mean_sales)
# Plot Sales and Expenses over Months
plt.figure(figsize=(10, 5))
plt.plot(df['Month'], df['Sales'], label='Sales', color='blue', marker='o')
plt.plot(df['Month'], df['Expenses'], label='Expenses', color='red', marker='s')
plt.title('Sales vs Expenses Over Months')
plt.xlabel('Month')
plt.ylabel('Amount')
plt.legend()
plt.grid(True)
plt.show()

Output:

DataFrame:
   Month  Sales  Expenses  Profit
0   Jan    200       180      20
1   Feb    250       200      50
2   Mar    300       220      80
3   Apr    280       210      70
4   May    320       240      80

Average Sales: 270.0

A line plot will be displayed showing Sales and Expenses trends over the months.

Key Takeaways

NumPy: Use for numerical computations and array operations.

Pandas: Use for data manipulation and working with tabular data (DataFrames).

Matplotlib: Use for creating visualizations to explore and communicate insights.

These libraries are often used together in data analysis workflows. Start with small datasets to practice, and gradually explore more advanced features like grouping in Pandas, matrix operations in NumPy, and advanced plotting in Matplotlib.
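Grouping in Pandas, mentioned above as a next step, can be sketched in a few lines. The `sales` table and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sales records
sales = pd.DataFrame({
    'Region': ['North', 'South', 'North', 'South', 'East'],
    'Product': ['A', 'A', 'B', 'B', 'A'],
    'Sales': [200, 250, 300, 280, 320]
})

# Split rows by Region, then compute several aggregates per group
summary = sales.groupby('Region')['Sales'].agg(['sum', 'mean', 'count'])
print(summary)
```

Each row of `summary` corresponds to one region, with one column per aggregate.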

Let me know if you'd like deeper explanations or examples on any specific topic!

Analysis

This comparison demonstrates the different approaches each AI model takes when responding to the same prompt. Here are the key differences observed:

Response Characteristics

ChatGPT: Provides a direct response with 39 sentences.

Claude: Provides a direct response with 46 sentences.

Grok: Provides a direct response with 78 sentences.
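The comparison does not say how sentences were counted. As a rough illustration only, one naive approach splits the text wherever terminal punctuation is followed by whitespace. The function name `count_sentences` and the regular expression are assumptions, not the tool's actual method.

```python
import re


def count_sentences(text: str) -> int:
    """Naively count sentences by splitting after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty fragments so an empty input counts as zero sentences
    return sum(1 for part in parts if part)


sample = "NumPy handles arrays. Pandas handles tables! Which one do you need?"
print(count_sentences(sample))
```

A counter like this is sensitive to code snippets, abbreviations, and decimal numbers, so counts for responses that are mostly code should be read as approximate.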

Key Takeaways

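ChatGPT gives the most compact answer: one short example per library plus a combined scatter-plot example.
Claude adds grouping, derived columns, and CSV input/output, but its response is cut off partway through the third section.
Grok gives the longest answer, with installation commands, key-feature lists, and expected output for most examples.
Consider your specific use case when choosing between these models.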

This comparison was generated using the SNEOS AI Comparison Tool. Published: September 30, 2025 | Models: ChatGPT, Claude, Grok