Generate metadata for a dataset in R

Looking at your data is a crucial part of any analysis in R. Before we start on anything else, we want to understand the data in hand. Sometimes our datasets are too big to go through column by column. Here is an easy way to generate metadata for your dataset. There are many aspects and features you may want to look at, but a few common ones are covered by a simple utility package called metadata, available on GitHub. You can simply install it and it works like a charm.


As of now, this package is available only on GitHub, so you need the devtools package to install it.

Install devtools (if you don’t have it)
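A minimal sketch of this step, assuming you install from CRAN. The requireNamespace() check is my own addition so that an existing installation is left alone:

```r
# Install devtools from CRAN only if it is not already available.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
```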


Load devtools and install metadata
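This step would look roughly like the sketch below. Note that OWNER is a placeholder I am using for illustration, not the real account name, so swap in the GitHub user that hosts the package before running it:

```r
library(devtools)

# "OWNER" is a placeholder (an assumption for this sketch), not the actual
# GitHub account. install_github() expects a string of the form "user/repo".
install_github("OWNER/metadata")

# Load the freshly installed package.
library(metadata)
```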


Yup! You are ready to use it.

metadata defines only two functions (at least at the time of writing):

  • getmode
  • generateMeta

generateMeta() is a general-purpose function that we can use to generate metadata. It returns a dataset that contains the names of the columns and some key properties of each one.

Let's try this on the built-in iris dataset:



          name na_Count blanks unique min max range medians mean mode
1 Sepal.Length        0      0     35 4.3 7.9   3.6    5.80 5.84  5.0
2  Sepal.Width        0      0     23 2.0 4.4   2.4    3.00 3.06  3.0
3 Petal.Length        0      0     43 1.0 6.9   5.9    4.35 3.76  1.4
4  Petal.Width        0      0     22 0.1 2.5   2.4    1.30 1.20  0.2
5      Species        0      0      3 0.0 0.0   0.0    0.00 0.00  0.0
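If you are curious how a table like this can be put together, here is a rough base R sketch. It is not the package's actual code: sketch_mode() and sketch_meta() are hypothetical names I made up for illustration, and unlike the output above they report NA (rather than 0) for the numeric statistics of non-numeric columns such as Species.

```r
# Hypothetical helper (not from the metadata package): most frequent
# non-missing value of a vector; ties go to the value seen first.
sketch_mode <- function(x) {
  ux <- unique(x[!is.na(x)])
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical stand-in for a metadata generator: one row of summary
# statistics per column of the data frame.
sketch_meta <- function(df) {
  rows <- lapply(names(df), function(nm) {
    x <- df[[nm]]
    num <- is.numeric(x)
    data.frame(
      name     = nm,
      na_count = sum(is.na(x)),                            # missing values
      blanks   = sum(!is.na(x) & as.character(x) == ""),   # empty strings
      unique   = length(unique(x)),                        # distinct values
      min      = if (num) min(x, na.rm = TRUE) else NA,
      max      = if (num) max(x, na.rm = TRUE) else NA,
      range    = if (num) diff(range(x, na.rm = TRUE)) else NA,
      median   = if (num) median(x, na.rm = TRUE) else NA,
      mean     = if (num) mean(x, na.rm = TRUE) else NA,
      mode     = as.character(sketch_mode(x)),
      stringsAsFactors = FALSE
    )
  })
  # Stack the one-row data frames into a single table.
  do.call(rbind, rows)
}

meta <- sketch_meta(iris)
print(meta)
```

The mode is stored as text so that numeric and factor columns can share one column in the result.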


