<?xml version="1.0"?>
<oembed><version>1.0</version><provider_name>Portfolio Oscar Javier Munoz Marciales</provider_name><provider_url>https://artechsano.pro</provider_url><author_name>oscarxxi</author_name><author_url>https://artechsano.pro/author/oscarxxi/</author_url><title>Getting Started with Pandas: A Beginner&#x2019;s Guide - Portfolio Oscar Javier Munoz Marciales</title><type>rich</type><width>600</width><height>338</height><html>&lt;blockquote class="wp-embedded-content" data-secret="UhXVmBhOmZ"&gt;&lt;a href="https://artechsano.pro/articles/getting-started-with-pandas-a-beginners-guide/"&gt;Getting Started with Pandas: A Beginner&#x2019;s Guide&lt;/a&gt;&lt;/blockquote&gt;&lt;iframe sandbox="allow-scripts" security="restricted" src="https://artechsano.pro/articles/getting-started-with-pandas-a-beginners-guide/embed/#?secret=UhXVmBhOmZ" width="600" height="338" title="&#x201C;Getting Started with Pandas: A Beginner&#x2019;s Guide&#x201D; &#x2014; Portfolio Oscar Javier Munoz Marciales" data-secret="UhXVmBhOmZ" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" class="wp-embedded-content"&gt;&lt;/iframe&gt;&lt;script type="text/javascript"&gt;
/* &lt;![CDATA[ */
/*! This file is auto-generated */
!function(d,l){"use strict";l.querySelector&amp;&amp;d.addEventListener&amp;&amp;"undefined"!=typeof URL&amp;&amp;(d.wp=d.wp||{},d.wp.receiveEmbedMessage||(d.wp.receiveEmbedMessage=function(e){var t=e.data;if((t||t.secret||t.message||t.value)&amp;&amp;!/[^a-zA-Z0-9]/.test(t.secret)){for(var s,r,n,a=l.querySelectorAll('iframe[data-secret="'+t.secret+'"]'),o=l.querySelectorAll('blockquote[data-secret="'+t.secret+'"]'),c=new RegExp("^https?:$","i"),i=0;i&lt;o.length;i++)o[i].style.display="none";for(i=0;i&lt;a.length;i++)s=a[i],e.source===s.contentWindow&amp;&amp;(s.removeAttribute("style"),"height"===t.message?(1e3&lt;(r=parseInt(t.value,10))?r=1e3:~~r&lt;200&amp;&amp;(r=200),s.height=r):"link"===t.message&amp;&amp;(r=new URL(s.getAttribute("src")),n=new URL(t.value),c.test(n.protocol))&amp;&amp;n.host===r.host&amp;&amp;l.activeElement===s&amp;&amp;(d.top.location.href=t.value))}},d.addEventListener("message",d.wp.receiveEmbedMessage,!1),l.addEventListener("DOMContentLoaded",function(){for(var e,t,s=l.querySelectorAll("iframe.wp-embedded-content"),r=0;r&lt;s.length;r++)(t=(e=s[r]).getAttribute("data-secret"))||(t=Math.random().toString(36).substring(2,12),e.src+="#?secret="+t,e.setAttribute("data-secret",t)),e.contentWindow.postMessage({message:"ready",secret:t},"*")},!1)))}(window,document);
//# sourceURL=https://artechsano.pro/wp-includes/js/wp-embed.min.js
/* ]]&gt; */
&lt;/script&gt;
</html><thumbnail_url>https://artechsano.pro/wp-content/uploads/2025/11/Gemini_Generated_Image_vo2mchvo2mchvo2m.png</thumbnail_url><thumbnail_width>1024</thumbnail_width><thumbnail_height>1024</thumbnail_height><description>If you&#x2019;re new to Python and curious about data analysis, you&#x2019;ve probably heard of Pandas. This powerful library is an essential tool for anyone working with data&#x2014;from spreadsheets to databases to web-scraped information. It&#x2019;s the foundation for most data science workflows in Python. In this post, we&#x2019;ll explore what Pandas is, why it&#x2019;s so useful, and walk through the core operations you need to start analyzing data today. &#x1F4E6; What Is Pandas? Pandas is an open-source Python library built for data manipulation and analysis. It provides high-performance, easy-to-use data structures that make working with &#x201C;relational&#x201D; or &#x201C;labeled&#x201D; data (like you&#x2019;d find in a spreadsheet) simple and intuitive. It introduces two primary data structures: Series: A one-dimensional labeled array. Think of it as a single column in a spreadsheet, like a list of ages, but with custom labels (called an index). DataFrame: A two-dimensional labeled data structure with columns of potentially different types. This is the workhorse of Pandas. It&#x2019;s the whole spreadsheet&#x2014;a collection of Series (columns) that share the same index (rows). You can easily create them from scratch to experiment: &#x1F40D; filename.py import pandas as pd # A Series (one column) s = pd.Series([10, 20, 30], index=['a', 'b', 'c']) # A DataFrame (multiple columns) data = {'Product': ['Apples', 'Oranges', 'Bananas'], 'Price': [0.5, 0.4, 0.25]} df = pd.DataFrame(data) &#x1F680; Why Use Pandas? 
&#xA0; Here&#x2019;s why Pandas is a favorite among data analysts, scientists, and Python developers: Simple &amp; Intuitive: The syntax is designed to be readable and expressive, letting you accomplish complex tasks in just a few lines. Flexible Data Handling: It can read and write data in a wide variety of formats, including CSV, Excel, JSON, SQL databases, HTML, and more. Powerful Operations: It makes complex operations simple. You can filter, group, merge, pivot, and reshape data in a few lines. It also has specialized tools for working with time series data. Performance: It&#x2019;s built on top of NumPy, which means many of its operations are vectorized and optimized for speed. Integration: It works well with other libraries in the scientific Python ecosystem, like NumPy (for computation), Matplotlib/Seaborn (for plotting), and Scikit-learn (for machine learning). &#xA0; &#x1F6E0;&#xFE0F; Getting Started: Installation &amp; Loading Data &#xA0; To begin, you&#x2019;ll need to install Pandas. The most common way is with pip: &#x1F40D; pip install pandas Once installed, you import it into your Python script (the as pd is a standard, widely used alias): &#x1F40D; import pandas as pd While you can create DataFrames from scratch (as shown above), you&#x2019;ll usually load data from a file. Let&#x2019;s load a simple CSV file: &#x1F40D; # Reads a CSV file into a DataFrame df = pd.read_csv('your_data.csv') # You can also easily read from other file types # df_excel = pd.read_excel('your_data.xlsx') &#x1F50D; Exploring Your Data: The Basic Workflow &#xA0; Once your data is loaded into a DataFrame, the first step is always inspection. You need to understand what you&#x2019;re working with. Let&#x2019;s assume the data was loaded into a DataFrame named df. Here are the most critical commands: &#xA0; 1. Inspecting Your Data &#xA0; See the first few rows: df.head() (by default, it shows 5). See the last few rows: df.tail() Get a concise summary: df.info() This is crucial! 
It shows row/column counts, column names, data types (e.g., int64, float64, object), and, most importantly, the number of non-null (non-empty) values. Get quick statistics: df.describe() For numerical columns, this shows count, mean, standard deviation, min, max, and percentiles. See the dimensions: df.shape (returns a tuple: (rows, columns)) See the column names: df.columns &#xA0; 2. Selecting &amp; Filtering Data &#xA0; This is where you&#x2019;ll spend a lot of your time. Select a single column (returns a Series): df['column_name'] Select multiple columns (returns a new DataFrame): df[['col1', 'col2']] Select rows by position (integer-based): df.iloc[0] (gets the very first row) df.iloc[0:5] (gets the first five rows) Select rows by label/index (label-based): df.loc['index_label'] (if your index is a name, e.g., &#x2018;a&#x2019;, &#x2018;b&#x2019;, &#x2018;c&#x2019;) Conditional Filtering (Boolean Masking): This is the most powerful way to select data. df[df['age'] &gt; 30] (selects all rows where the &#x2018;age&#x2019; column is over 30) df[(df['age'] &gt; 30) &amp; (df['city'] == 'New York')] (use &amp; for &#x201C;and&#x201D;, | for &#x201C;or&#x201D;, and wrap each condition in parentheses) &#xA0; 3. Manipulating &amp; Cleaning Data &#xA0; Create a new column: df['new_column'] = df['col1'] + df['col2'] Handle missing data: df.dropna() (returns a new DataFrame without the rows that have any missing values) df.fillna(value=0) (returns a new DataFrame with all missing values filled with 0) Rename columns: df = df.rename(columns={'old_name': 'new_name', 'another_old': 'another_new'}) &#xA0; 4. Aggregating &amp; Grouping Data &#xA0; This is the key to summarizing your data, and it follows the &#x201C;split-apply-combine&#x201D; pattern: split the rows into groups, apply a function to each group, and combine the results. The groupby method: df.groupby('category').mean(numeric_only=True) (calculates the mean of each numeric column for each unique category; in pandas 2.0 and later, omitting numeric_only=True can raise a TypeError when non-numeric columns are present) Let&#x2019;s look at a simple example. Imagine you have sales data: &#x1F40D; # 1. Create a new 'revenue' column df['revenue'] = df['price'] * df['quantity'] # 2. 
Group by product and sum the revenue revenue_by_product = df.groupby('product')['revenue'].sum() print(revenue_by_product) You can also get multiple statistics at once using .agg(): &#x1F40D; # Get total revenue and average price per product stats = df.groupby('product').agg({ 'revenue': 'sum', 'price': 'mean' }) print(stats) &#xA0; 5. Combining DataFrames &#xA0; Merging (like a SQL JOIN): Combines DataFrames based on a common column. pd.merge(df1, df2, on='id_column') Concatenating (stacking): Stacks DataFrames on top of each other (ideally with the same columns; any column missing from one DataFrame is filled with NaN). pd.concat([df1, df2]) &#xA0; &#x1F4C8; Bonus: Quick Visualization &#xA0; One of the best features of Pandas is its built-in plotting, which uses Matplotlib under the hood. This lets you get a quick visual check of your data without writing Matplotlib code yourself, although Matplotlib must be installed. After our groupby example above, you could plot the results in one line: &#x1F40D; # Assuming 'revenue_by_product' is the Series from the last example revenue_by_product.plot(kind='bar', title='Total Revenue by Product') Or, you could plot a histogram of a single column from your original DataFrame: &#x1F40D; df['price'].plot(kind='hist', bins=20, title='Price Distribution') &#x1F4DA; Learning Resources &#xA0; Ready to go deeper? These resources are fantastic for leveling up your Pandas skills: Official Pandas Documentation: The user guide and API reference are essential. Kaggle Learn: They have an excellent, free, hands-on micro-course on Pandas. Real Python: Features numerous</description></oembed>
