How to create a DataFrame object in Pandas


In this article I’m going to show two different ways to create a Pandas DataFrame object: from a dictionary and from a CSV file.

Create a DataFrame from a dictionary

import numpy as np
import pandas as pd

cars = pd.DataFrame({
    'model_id': [10010, 20020, 30030, 40040, 50050, 60060, 70070],
    'maker': ['Audi', 'BMW', 'Kia', 'Honda', 'Ford', 'Kia', 'Toyota'],
    'model': ['Q7', '5-Series', 'Ceed', 'CR-V', 'Kuga', 'Rio', 'Yaris'],
    'category': ['SUV', 'sedan', 'hatchback', 'SUV', 'SUV', 'sedan', 'hatchback'],
    'price': [80000, 60000, 25000, 35000, 30000, 20000, 30000],
    'stock_count': [3, 6, 2, 14, 12, 13, np.nan],
})

Each key of the dictionary becomes a column name, and the list stored under that key supplies the values of that column. All the lists must have the same length.

The code above creates a DataFrame object from the dictionary, with seven rows and six columns.

The numpy module is not required; I used it only to show how to put a NaN value in the list with np.nan. We could just as well write None there, since Pandas converts None to NaN in a numeric column, but I wanted to draw your attention to the fact that NaN is available in the numpy module.
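To make the None versus np.nan point concrete, here is a minimal sketch. The single-column frames and the names with_nan and with_none are illustrative and not part of the original example. It suggests that, in a numeric column, both spellings end up as the same missing value:

```python
import numpy as np
import pandas as pd

# Two hypothetical single-column frames that differ only in how the
# missing value is written in the input list.
with_nan = pd.DataFrame({'stock_count': [3, 6, np.nan]})
with_none = pd.DataFrame({'stock_count': [3, 6, None]})

# isna() flags the missing entry in both frames.
print(with_nan['stock_count'].isna().tolist())
print(with_none['stock_count'].isna().tolist())

# The column is stored as floating point in both cases, because NaN is
# a float value.
print(with_nan['stock_count'].dtype)
print(with_none['stock_count'].dtype)
```

Because NaN is a floating-point value, the whole stock_count column is stored as floats, which is why the counts are displayed as 3.0, 6.0 and so on.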

Create a DataFrame from a CSV file

Another way to create a DataFrame is to import the data from a CSV (comma-separated values) file. The following code creates a DataFrame from a file named cars.csv:

import pandas as pd
df = pd.read_csv('/path/to/file/cars.csv')

The resulting DataFrame looks like this:

The cars.csv file content is:


The last line has nothing after its final comma, so the value for stock_count is missing, and the first data line is missing the value for the price column. read_csv loads both of these empty fields as NaN.
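The effect of empty CSV fields can be tried without a file on disk. The sketch below is only an illustration: the three-row csv_text is a hypothetical sample shaped like the description above (an empty price in the first data line, nothing after the last comma in the last line), not the actual content of cars.csv, and io.StringIO stands in for the file path.

```python
import io

import pandas as pd

# Hypothetical sample data, NOT the original cars.csv. The first data
# row has an empty price field; the last row ends right after the comma,
# so its stock_count is empty.
csv_text = """model_id,maker,model,category,price,stock_count
10010,Audi,Q7,SUV,,3
20020,BMW,5-Series,sedan,60000,6
70070,Toyota,Yaris,hatchback,30000,
"""

# read_csv accepts a file-like object as well as a path.
df = pd.read_csv(io.StringIO(csv_text))

print(df)

# Count the missing values per column.
print(df.isna().sum())
```

With a real file, pass its path to pd.read_csv instead of the StringIO object; the handling of empty fields should be the same.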


Now you know two ways to create and initialize a DataFrame object in Pandas.