Data collection


#1

This is a proposed new standard glossary term. See this post for background on this review track.

Short Definition: A logical grouping of (research) datasets that share a common aspect or concept.

Extended Definition: A logical grouping of (research) datasets that share a common aspect or concept. A data collection is the highest level in the hierarchy of data groupings (data collection, dataset, data granule). It comprises datasets that are strongly connected, and it is organised coherently around a single element or concept (e.g., model, instrument).
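As a rough illustration of the three-level hierarchy named above, the grouping could be sketched as nested records. This is only a sketch under stated assumptions: the class names, field names, and example values are hypothetical and are not part of the proposed definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Granule:
    # Lowest level of the hierarchy; the name field is a hypothetical label.
    name: str


@dataclass
class Dataset:
    # Middle level: a dataset groups granules.
    name: str
    granules: List[Granule] = field(default_factory=list)


@dataclass
class DataCollection:
    # Highest level: datasets grouped around a single element or concept.
    name: str
    organising_concept: str  # e.g. "model" or "instrument"
    datasets: List[Dataset] = field(default_factory=list)

    def granule_count(self) -> int:
        # Total number of granules across every dataset in the collection.
        return sum(len(ds.granules) for ds in self.datasets)


# Hypothetical example values, for illustration only.
collection = DataCollection(
    name="Example collection",
    organising_concept="instrument",
    datasets=[
        Dataset("Observations A", [Granule("a-1"), Granule("a-2")]),
        Dataset("Observations B", [Granule("b-1")]),
    ],
)
```

Here the `organising_concept` field stands in for the "single element or concept" in the definition; a real catalogue would likely carry richer metadata at each level.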

Synonyms:

Acronym:

Related Terms:

Sources:

Term Lead: Lesley Wyborn


#2

I think you will run into some confusion here. The term itself is not problematic and is, in fact, quite useful. The problem is that people may mistake this definition for the actual act of gathering/collecting data, which is also called data collection. That second sense refers to ‘how’ you go about collecting data (and what data you actually choose to collect), not how that data is organized thematically.

Within the Canadian context, researchers are encouraged to engage in responsible research data management practice by, among other things, developing and following Data Management Plans (DMPs). The main resource for this in Canada is DMP Assistant (developed by Portage).

The DMP Assistant breaks down DMPs into six areas. Notably, the first area is “Data Collection”. This area tries to answer the following questions:

  • What types of data will you collect, create, link to, acquire and/or record?
  • What file formats will your data be collected in? Will these formats allow for data re-use, sharing and long-term access to the data?
  • What conventions and procedures will you use to structure, name and version-control your files to help you and others better understand how your data are organized?

The DMP Online tool used by European researchers uses the same term, “Data Collection”, in roughly the same sense as noted above.

My concern is that researchers may start to use these two terms for data collection in ways that create confusion. I just raise this as a possible concern.

Matt


#3

Once again, Matt, thank you for your comments. Yes, I agree. The solution in this case is to provide two definitions for the term “data collection.” (1) In the context of how data are organized: A logical grouping of research data sets … ; (2) In the context of how and what data are collected: What types of data will be collected, …


#4

Hi Claire,

I like this better, as it shows the different usages of the term ‘data collection’. I think that so long as we point out that the term may be used to refer to two things (i.e., logical groupings and the actual gathering of data), all looks good and the term will be functionally useful.


#5

This topic was automatically closed after 0 minutes. New replies are no longer allowed.

