Guidelines 

"MathNet57789"

The ASSISTments Foundation is excited to share this data. It contains no personally identifiable information (PII), and our privacy policy allows us to create and share research datasets so long as they do not include any student PII. Our mission is to "impact advancements in math instruction to make teaching and learning more evidence-based and aligned to the diverse needs of students." Research with public datasets is one way we carry out this mission.

Responsible Use Guidelines for Dataset: “MathNet57789”


These Responsible Use Guidelines (“Guidelines”) govern your use of the dataset “MathNet57789” (“Dataset”). As used in these Guidelines, “you” refers to any user of the Dataset, and “we” and “us” refer to Worcester Polytechnic Institute.


The Dataset is intended for research and educational purposes and is provided on a free and open basis for the benefit of the general public. By sharing this dataset, we aim to encourage innovation and address the technology gaps that prevent us from fully understanding students' writing.


This dataset has three different types of content, so different Creative Commons licenses apply to different parts of it. While we provide this information to be helpful to you, it is not legal advice, and you should address all legal questions to a lawyer.


The first type of data: The math problems themselves (we call them the problem bodies) have been pulled from open-source curriculum providers, with some minor modifications to make it easy for ASSISTments to give feedback. These providers have released their content under different CC licenses. To make it easy to see who authored the content (and which CC license you need to honor), the dataset includes a field that indicates the origin of each problem. Here are the three original curricula and a note on their licenses so you can give appropriate credit.




The second type of data: This dataset contains student answers in the form of images. You cannot commercialize these images. Therefore, we are releasing them under CC-BY-NC-4.0.


The third type of data:  We are releasing the text annotations of the images (i.e., the image captions), the human-authored question-answer pairs, and the synthetic question-answer pairs under CC-BY-4.0.


Personally Identifiable Information

We believe we have removed all Personally Identifiable Information (PII) from the images shared in the dataset. Do not attempt to identify the author of any image. If you see any PII in the data while using it for any purpose, you must notify us by emailing etrials@assistments.org within 30 days and share the precise image in which you found the PII. Because the dataset may be updated when PII is found, we require researchers to use the most recently updated publicly hosted version. As such, you are required to check at least once a month to see whether we have posted a new version.


Open Science

As part of our responsibilities to the National Science Foundation and the tenets of Open Science, we share de-identified datasets that our lab has created. We ask that you include the name of the dataset in your work so that others can replicate it.