Convert dataframe to data.table in R

In this article, we will discuss how to convert a dataframe to a data.table in the R programming language. data.table is an R package that provides an enhanced version of the dataframe.

Method 1 : Using setDT() method

While dataframes are available in base R, the data.table object is part of the data.table package, which must be installed and loaded before use. The setDT() function coerces a dataframe or a list into a data.table. The conversion is made by reference: the original object is modified in place instead of being copied.

Example 1:

R

# load the required library
library(data.table)

# declare a dataframe
data_frame <- data.frame(col1 = c(1:7),
                         col2 = LETTERS[1:7],
                         col3 = letters[1:7])

print("Original DataFrame")
print(data_frame)

# convert into data.table by reference
setDT(data_frame)

print("Resultant DataFrame")
print(data_frame)

Output

[1] "Original DataFrame"
  col1 col2 col3
1    1    A    a
2    2    B    b
3    3    C    c
4    4    D    d
5    5    E    e
6    6    F    f
7    7    G    g
[1] "Resultant DataFrame"
   col1 col2 col3
1:    1    A    a
2:    2    B    b
3:    3    C    c
4:    4    D    d
5:    5    E    e
6:    6    F    f
7:    7    G    g
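Method 1 mentions that setDT() also accepts lists. A minimal sketch of that case follows, assuming a named list whose elements all have the same length; the names `columns`, `id`, and `label` are made up for this illustration.

```r
library(data.table)

# a plain named list with equal-length elements
# (hypothetical names, used only for this sketch)
columns <- list(id = 1:3, label = c("x", "y", "z"))

# convert in place; no assignment is needed
setDT(columns)

# the same variable is now a data.table with one column per list element
print(is.data.table(columns))
print(columns)
```

As with a dataframe, the list is converted by reference, so the variable itself changes class.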

Missing (NA) values stored in the dataframe are preserved in the data.table. Custom row names are not kept by default; the rows are instead displayed with integer identifiers running from 1 to the number of rows. The data.table package also provides is.data.table(), which checks whether an R object is a data.table: is.data.table(data_frame) returns TRUE if the argument is a data.table and FALSE otherwise.

Example 2:

R

# load the required library
library(data.table)

# declare a dataframe with missing values and custom row names
data_frame <- data.frame(col1 = c(1, NA, 4, NA, 3, NA),
                         col2 = c("a", NA, "b", "e", "f", "G"),
                         row.names = c("row1", "row2", "row3",
                                       "row4", "row5", "row6"))

print("Original DataFrame")
print(data_frame)

# convert into data.table by reference
setDT(data_frame)

print("Resultant DataFrame")
print(data_frame)

# check whether the object is now a data.table
print("Check if data table")
print(is.data.table(data_frame))

Output

[1] "Original DataFrame"
     col1 col2
row1    1    a
row2   NA <NA>
row3    4    b
row4   NA    e
row5    3    f
row6   NA    G
[1] "Resultant DataFrame"
   col1 col2
1:    1    a
2:   NA <NA>
3:    4    b
4:   NA    e
5:    3    f
6:   NA    G
[1] "Check if data table"
[1] TRUE

Explanation: The original object is stored as a data.frame. After setDT() is applied, the same object is a data.table: the custom row names are dropped and each row is printed with a row number followed by a colon. The missing values (NA) are preserved as they are. Because the conversion is made to the object itself, is.data.table() returns the logical value TRUE.
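Example 2 loses the custom row names. If they need to survive the conversion, one option is the keep.rownames argument of setDT(), which stores them in a new first column named rn. The sketch below is only an illustration on a shortened version of the dataframe above, not part of the original example.

```r
library(data.table)

# a shortened version of the dataframe from Example 2
data_frame <- data.frame(col1 = c(1, NA, 4),
                         col2 = c("a", NA, "b"),
                         row.names = c("row1", "row2", "row3"))

# keep.rownames = TRUE moves the row names into a column called "rn"
setDT(data_frame, keep.rownames = TRUE)

print(names(data_frame))
print(data_frame)
```

The row names become ordinary data in the rn column, so they can be filtered or joined on like any other column.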

Method 2 : Using as.data.table() method

The as.data.table() function coerces a dataframe or a list into a data.table if the object is not already a data.table and the conversion is possible. Unlike setDT(), it does not change the original dataframe; it returns a new data.table that is a copy of the original object.
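The difference from setDT() can be checked directly. The following is a minimal sketch, with the variable names `df` and `dt` chosen only for illustration: the result of as.data.table() is assigned to a new variable, and the original is inspected afterwards.

```r
library(data.table)

# a small dataframe (hypothetical data for this sketch)
df <- data.frame(col1 = 1:3, col2 = c("a", "b", "c"))

# as.data.table() returns a new object; df itself is not touched
dt <- as.data.table(df)

print(class(df))           # the original is still a plain data.frame
print(is.data.table(dt))   # the copy is a data.table
```

Because a copy is made, this approach uses more memory than setDT() on large inputs, but it keeps the original dataframe available.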

Example: