Let's import pandas, our super tool for data cleaning and data wrangling:

```python
import pandas as pd
```

Reading a CSV file in pandas:

```python
df = pd.read_csv("data/sample-data.csv")
```

* The `df.head()` method returns the first 5 rows by default; pass an argument `n` to display a different number of rows.
    
* The `df.info()` method provides information about column names, non-null counts, data types, and memory usage.
    
* The `df.size` attribute returns the total number of elements in the DataFrame.
    

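To try these inspection calls without a file on disk, here is a minimal sketch. The CSV text and its column names (`name`, `price_abc`, `city`) are made up for illustration and stand in for `data/sample-data.csv`; `df.shape` is not mentioned above but is a handy companion to `df.size`.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for data/sample-data.csv
csv_text = """name,price_abc,city
pen,120.5,Pune
book,,Delhi
bag,870.0,
lamp,450.0,Mumbai
"""

df = pd.read_csv(io.StringIO(csv_text))

first_rows = df.head(2)      # first 2 rows instead of the default 5
n_rows, n_cols = df.shape    # (rows, columns)
n_elements = df.size         # rows * columns, missing values included

df.info()                    # prints column names, non-null counts, dtypes, memory usage
```

Empty fields in the CSV are read as NaN, which is why the non-null counts printed by `df.info()` can be lower than the number of rows.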
A common first step in data cleaning is handling null values. Calling `dropna()` removes every row that contains at least one NaN:

```python
df = df.dropna()
```
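Dropping every row with a gap can throw away more data than you want, so it may help to count the missing values first and then decide. The sketch below uses a small made-up DataFrame; `subset` and `fillna` are standard pandas options, but whether to drop or fill depends on your data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in two columns
df = pd.DataFrame({
    "name": ["pen", "book", "bag"],
    "price_abc": [120.5, np.nan, 870.0],
    "city": ["Pune", "Delhi", None],
})

missing_per_column = df.isna().sum()             # number of missing values per column

dropped_any = df.dropna()                        # drop rows with a NaN in any column
dropped_price = df.dropna(subset=["price_abc"])  # drop only rows missing a price
filled_city = df.fillna({"city": "unknown"})     # keep the rows, fill the gap instead
```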

Pandas has built-in methods such as `str.split` and `str.replace` for tidying up text values. To change the data type of a column, use `astype`:

```python
df1["col"] = df1["col"].astype(float)
```
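Casting to `float` fails if the strings still contain separators or units, so the text usually needs a cleanup pass first. Here is a rough sketch of that flow; the values in `col` (weights with a `kg` suffix) are invented for the example.

```python
import pandas as pd

# Hypothetical raw column: numbers stored as text with separators and units
df1 = pd.DataFrame({"col": ["1,200 kg", "850 kg", "2,075 kg"]})

# str.split cuts each string on whitespace; .str[0] keeps the numeric part
numeric_text = df1["col"].str.split().str[0]

# str.replace removes the thousands separator (regex=False means a literal match)
numeric_text = numeric_text.str.replace(",", "", regex=False)

# Now every value is a plain number in text form, so the cast succeeds
df1["col"] = numeric_text.astype(float)
```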

Arithmetic on a column is applied element-wise, similar to a NumPy array, so you can compute and assign a new column in one line:

```python
df2["price_usd"] = (df2["price_abc"] / 87.2).round(2)
```

This way, you can clean the DataFrames loaded from separate CSV files individually and then concatenate them:

```python
df = pd.concat([df1, df2])
df.info()
```
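One thing to watch for: `pd.concat` keeps each DataFrame's original index, so row labels can repeat in the result. A small sketch with made-up data shows the difference when you pass `ignore_index=True`:

```python
import pandas as pd

# Hypothetical cleaned frames, e.g. loaded from two separate CSV files
df1 = pd.DataFrame({"name": ["pen", "book"], "price_usd": [1.38, 4.59]})
df2 = pd.DataFrame({"name": ["bag", "lamp"], "price_usd": [9.98, 5.16]})

stacked = pd.concat([df1, df2])                        # index repeats: 0, 1, 0, 1
renumbered = pd.concat([df1, df2], ignore_index=True)  # fresh index: 0, 1, 2, 3
```

If you never look rows up by label, the repeated index is harmless; renumbering just avoids surprises with `df.loc` later.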

You can confirm the cleaned data with `df.info()`.

I'll continue with more topics on EDA and data analysis with pandas :D  
Pick a simple CSV file for practice.

Thanks for reading!

Cleaning data with pandas