Indexing, Manipulation and Visualization of data - Part 2

Amruthan Chandrasekaran updated on: 14 December 0

Day 4 Indexing, Manipulation and Visualization of data - Part 2

Now let us use the merge().

merge() is some what related to the joins in SQL, so if you google about the joins in SQL it is very easy for you to grasp these things quicker.

Ok look at these tables and we are going to merge these tables using different join operations

While we should create those data frames

df_a = pd.DataFrame({
        'roll_num': ['1', '2', '3', '4', '5'],
        'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'], 
        'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})

df_b = pd.DataFrame({
        'roll_num': ['4', '5', '6', '7', '8'],
        'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'], 
        'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

df_c = pd.DataFrame({
        'roll_num': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
        'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]})

roll_num	first_name	last_name
1	Alex	Anderson
2	Amy	Ackerman
3	Allen	Ali
4	Alice	Aoni
5	Ayoung	Atitches

roll_num	first_name	last_name
4	Billy	Bonder
5	Bryan	Black
6	Bran	Balwner
7	Bryce	Brice
8	Betty	Btisan

roll_num	test_id
1	51
2	15
3	15
4	61
5	16
7	14
8	15
9	1
10	61
11	16

Using these dataframe we'll try to use basic join operation, inner, left, outer, right.

Try this code snippets , please put import statements by yourself

pd.merge(df_a, df_c, on='roll_num')

pd.merge(df_a, df_b, on='roll_num', how='outer')

pd.merge(df_a, df_b, on='roll_num', how='inner')

pd.merge(df_a, df_b, on='roll_num', how='right')

pd.merge(df_a, df_b, on='roll_num', how='left')

You might wonder which one to use merge() or concat() that belongs to the data you are going to work with, you can find one variation between merge and concate as concate function join the data at bottom of another dat frame , and merge join to the right.

And that one also can be changed by changing axes value.

Note:

Check this link for the pandas functions and uses , you can navigate to all functions, their attributes and their values.

Well now we will look how to visualize the data using visualization tools

Oh! wait one minute, how did I forget this one ,

You can use apply() function which was often you are going to use in Data Manipulation.

This is very similar to for loop, as we use the keywords to access the data according to our requirement.

Let me share the codes with you , try it yourself. And also try with different datasets, data sets can be downloaded from Kaggle or in google.

import pandas as pd
import numpy as np

data_BM = pd.read_csv('data.csv')
# drop the null values
data_BM = data_BM.dropna(how="any")
# reset index after dropping
data_BM = data_BM.reset_index(drop=True)
# view the top results
data_BM.head()

# accessing row wise
data_BM.apply(lambda x: x)

# access first row
data_BM.apply(lambda x: x[0])

# accessing column wise
data_BM.apply(lambda x: x, axis=1)

# before clipping
data_BM["Fare"][:5]

# clip fare it is greater than 50
def clip_price(price):
    if price > 50:
        price = 50
    return price

# after clipping
data_BM["Fare"].apply(lambda x: clip_price(x))[:5]

data_BM["Embarked"][:5]

# label encode type 
def label_encode(type):
    if type == 'C':
        label = 0
    elif type == 'S':
        label = 1
    else:
        label = 2
    return label

# operate label_encode on every row of Outlet_Location_Type
data_BM["Embarked"] = data_BM["Embarked"].apply(label_encode)

# after label encoding
data_BM["Embarked"][:5]

Life of Mechon

Indexing, Manipulation and Visualization of data - Part 2

Day 4 Indexing, Manipulation and Visualization of data - Part 2

Recommended

Post a Comment

Post a Comment

Trending Topics

Recent Trends

Easy way to copy text from pc/laptop to mobile using PC finger

Recommended

Why should you Choose Mechanical Engineering ?

7 Things About Machine Learning You'll Kick Yourself for Not Knowing

Build form anywhere on web and store the data - not typical google forms

Machine Learning Day 1