Machine Learning

Pandas: Merging datasets

In industry, data usually arrives in chunks gathered from many different sources. Before we can analyse it, we will very likely need to combine those chunks into a single dataset.

Pandas makes this task easy. It provides the following functions for merging datasets:

  1. Append
  2. Concat
  3. Join
  4. Merge
Let's see how to merge data using the functions listed above. We will also look at the differences between them so that we know when to use which.
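
As a rough illustration of how these functions differ, here is a minimal sketch. The `customers`, `orders`, and `more_customers` frames and all their values are invented for this example and do not come from the post. Note that `DataFrame.append` was removed in pandas 2.0, so the sketch uses `pd.concat` for that case.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})
more_customers = pd.DataFrame({"customer_id": [4], "name": ["Dee"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3, 4], "amount": [250, 100, 75, 40]})

# Concat: stack frames that share the same columns on top of each other.
# (DataFrame.append did the same job but was removed in pandas 2.0.)
all_customers = pd.concat([customers, more_customers], ignore_index=True)

# Merge: SQL-style join on a shared column.
# "inner" keeps only customers that have orders; "left" keeps every customer.
inner = pd.merge(all_customers, orders, on="customer_id", how="inner")
left = pd.merge(all_customers, orders, on="customer_id", how="left")

# Join: combine two frames on their index rather than on a column.
totals = orders.groupby("customer_id")["amount"].sum().to_frame("total")
joined = all_customers.set_index("customer_id").join(totals, how="left")

print(all_customers)
print(inner)
print(left)
print(joined)
```

In short: reach for `concat` when the frames have the same shape and you want more rows, `merge` when you are matching rows on column values, and `join` when the matching key is already the index.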

SQLAlchemy - ORM for Python

This post explains the SQLAlchemy package, a SQL toolkit and object-relational mapper (ORM) used for connecting to relational databases (SQLite, PostgreSQL, MySQL, etc.).

Here we will discuss how to connect to a database using SQLAlchemy, create a database and its tables, insert data into the tables, select data in various ways, and perform delete and update operations, among other things.

This post assumes that the reader has basic knowledge of SQL.
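
To give a feel for these operations, here is a minimal sketch using SQLAlchemy's Core API against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer, and the `users` table and its rows are hypothetical, made up for this example; the post itself may use the ORM layer or a different schema.

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table,
    create_engine, delete, insert, select, update,
)

# Connect to an in-memory SQLite database; nothing is written to disk.
engine = create_engine("sqlite:///:memory:")
metadata = MetaData()

# Hypothetical table, defined only for this example.
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50), nullable=False),
)
metadata.create_all(engine)  # emits CREATE TABLE for every table in metadata

# engine.begin() commits on success and rolls back if an exception is raised.
with engine.begin() as conn:
    # Insert several rows at once.
    conn.execute(insert(users), [{"name": "Asha"}, {"name": "Ben"}, {"name": "Chen"}])
    # Update one row.
    conn.execute(update(users).where(users.c.name == "Ben").values(name="Benjamin"))
    # Delete one row.
    conn.execute(delete(users).where(users.c.name == "Chen"))
    # Select what is left, ordered by primary key.
    names = [row.name for row in conn.execute(select(users).order_by(users.c.id))]

print(names)
```

Switching to another database should mostly be a matter of changing the URL passed to `create_engine` and installing the matching driver.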

Advanced concepts of Pandas

In this post we will cover some advanced concepts that pandas has to offer. The post explains the following concepts:

  1. Creating a dataset from lists
  2. Using the plot function of pandas
  3. Pandas with time series
  4. Reindexing the index in pandas
  5. Resampling of data
  6. Chaining and filtering
  7. Grouping of data
  8. Transforming data
Let's get started and understand pandas in depth.
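
Several of the concepts listed above can be sketched in a few lines. The dates, store names, and sales figures below are invented for illustration, and plotting is left out so that the example runs without Matplotlib installed.

```python
import pandas as pd

# Create a dataset from plain lists (values are made up for illustration).
dates = ["2024-01-01", "2024-01-02", "2024-01-08", "2024-01-09"]
stores = ["north", "south", "north", "south"]
sales = [10, 20, 30, 40]
df = pd.DataFrame({"date": dates, "store": stores, "sales": sales})

# Time series: parse the date strings and use them as the index.
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# Reindexing: put one store's sales on a full daily calendar,
# filling the days that have no record with 0.
north = df.loc[df["store"] == "north", "sales"]
calendar = pd.date_range("2024-01-01", "2024-01-08", freq="D")
daily_north = north.reindex(calendar, fill_value=0)

# Resampling: roll the daily rows up into weekly totals.
weekly = df["sales"].resample("W").sum()

# Chaining and filtering: select a store, then keep only the larger sales.
big_south = df[df["store"] == "south"].query("sales > 25")

# Grouping: total sales per store.
per_store = df.groupby("store")["sales"].sum()

# Transforming: each sale as a share of its own store's total.
# transform returns a result aligned with the original rows.
df["share_of_store"] = df["sales"] / df.groupby("store")["sales"].transform("sum")

print(daily_north)
print(weekly)
print(big_south)
print(per_store)
print(df)
```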

Machine Learning: Cleaning data

In this post we will look at various techniques used to clean a dataset. Before cleaning data we need to understand it, so this post also explains ways to get to know the data we are dealing with.

Cleaning data is one of the crucial steps in the entire process of creating a machine learning model.

Let's get started and master the crucial process of cleaning datasets.
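
As a small taste of what that can look like, here is a sketch that first inspects a messy table and then cleans it with pandas. The `raw` data and the specific cleaning choices (median fill, title-casing city names) are assumptions made for this example, not necessarily the techniques the post covers.

```python
import pandas as pd

# Hypothetical messy data: stray whitespace, inconsistent case,
# a duplicate row, a non-numeric age, and a missing name.
raw = pd.DataFrame({
    "name": [" Asha", "Ben", "Ben", "Chen", None],
    "age": ["34", "41", "41", "n/a", "29"],
    "city": ["Pune", "pune", "pune", "Delhi", "Delhi"],
})

# Step 1: understand the data before changing anything.
raw.info()                         # column types and non-null counts
print(raw.isna().sum())            # missing values per column
print(raw["city"].value_counts())  # reveals the "Pune" / "pune" inconsistency

# Step 2: clean it.
clean = (
    raw.assign(
        name=raw["name"].str.strip(),                    # drop stray whitespace
        city=raw["city"].str.title(),                    # normalise capitalisation
        age=pd.to_numeric(raw["age"], errors="coerce"),  # "n/a" becomes NaN
    )
    .dropna(subset=["name"])   # a row without a name is unusable here
    .drop_duplicates()         # exact duplicates after normalisation
    .reset_index(drop=True)
)

# Fill the remaining missing ages with the median age.
clean["age"] = clean["age"].fillna(clean["age"].median())

print(clean)
```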

Importing data using Python

In today's world, a lot of data is generated by various devices. The format of this data varies from flat files to tabular structures. In this post, we will look at Python packages for importing data. We will cover techniques for importing the following file types:

  1. Flat files - .txt, .csv files
  2. Pickled files
  3. Excel files
  4. SAS files
  5. Stata files
  6. HDF5 files
  7. MATLAB .mat files
  8. Relational databases
  9. Data from the web
Let's get started with importing data from various file formats.
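
As a starting point, three of these sources (a flat CSV file, a pickled file, and a relational database) can be read with nothing but the standard library. The sketch below writes small sample files into a temporary directory first so that it is self-contained; the file names and weather values are invented for illustration, and the post may well use other packages for the same job.

```python
import csv
import pickle
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample records, invented for illustration.
rows = [{"city": "Pune", "temp_c": "31"}, {"city": "Delhi", "temp_c": "38"}]

with tempfile.TemporaryDirectory() as tmp_name:
    tmp = Path(tmp_name)

    # Flat file: write a CSV, then read it back as a list of dicts.
    csv_path = tmp / "weather.csv"
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["city", "temp_c"])
        writer.writeheader()
        writer.writerows(rows)
    with csv_path.open(newline="") as f:
        from_csv = list(csv.DictReader(f))  # every value is read as a string

    # Pickled file: round-trips Python objects with their types intact.
    # Only unpickle files you trust; unpickling can execute arbitrary code.
    pkl_path = tmp / "weather.pkl"
    pkl_path.write_bytes(pickle.dumps(rows))
    from_pickle = pickle.loads(pkl_path.read_bytes())

    # Relational database: SQLite stores the whole database in one file.
    conn = sqlite3.connect(tmp / "weather.db")
    conn.execute("CREATE TABLE weather (city TEXT, temp_c INTEGER)")
    conn.executemany(
        "INSERT INTO weather VALUES (?, ?)",
        [(r["city"], int(r["temp_c"])) for r in rows],
    )
    conn.commit()
    from_db = conn.execute(
        "SELECT city, temp_c FROM weather ORDER BY temp_c"
    ).fetchall()
    conn.close()

print(from_csv)
print(from_pickle)
print(from_db)
```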

Python packages for Data Science

It's really important to know which packages are useful when solving data science problems. I will be talking about the following Python packages, which are used to solve data-related problems:

  1. NumPy
  2. Matplotlib
  3. Pandas
  4. scikit-learn
We will be using pip to install the above packages. Let's get started.
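
Once `pip install numpy matplotlib pandas scikit-learn` has run, a quick way to confirm what actually got installed is to ask Python itself. The sketch below assumes Python 3.8 or newer, and `installed_versions` is a hypothetical helper written for this example, not part of any of the packages.

```python
from importlib import metadata

# Distribution names as published on PyPI.
PACKAGES = ["numpy", "matplotlib", "pandas", "scikit-learn"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```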

Introduction to machine learning

Machine learning is really making a buzz in the industry, so it's the right time to get familiar with it. I will start with the basics, and in my upcoming posts I will move on to advanced topics in machine learning.

Excited! Let's get started.
