I'm writing a data availability statement and sharing data. Help me: 1) State where data can be accessed (repository

Prompt
I'm writing a data availability statement and sharing data. Help me: 1) State where data can be accessed (repository
Gemini (Google)

A strong Data Availability Statement (DAS) is crucial for transparency, reproducibility, and enabling others to build upon your work. It should clearly state where your data can be found, under what conditions, and how it can be accessed.

Here's a breakdown of how to write your DAS, with different scenarios and examples:


Key Information to Include in a DAS:


  • What data? (Briefly describe the dataset/materials)

  • Where is it? (Repository name)

  • How to access it? (DOI, URL, or specific instructions for restricted data)

  • Under what terms? (Licensing information, if applicable)

  • Any restrictions? (If not fully open, state the reason and process for access)
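
The five questions above can be treated as fields of a small record and assembled into a sentence. The following is a minimal sketch, not a journal-approved generator: the `DataAvailability` class, its field names, and the `render_statement` function are all hypothetical names introduced here for illustration, and the wording it produces simply mirrors the example statements in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataAvailability:
    """Hypothetical record holding the key elements of a DAS."""
    description: str                   # What data?
    repository: str                    # Where is it?
    identifier: str                    # How to access it? (DOI or URL)
    license: Optional[str] = None      # Under what terms?
    restriction: Optional[str] = None  # Any restrictions? (full sentence)


def render_statement(das: DataAvailability) -> str:
    """Assemble a one- or two-sentence statement from the fields."""
    # Only claim "openly available" when no restriction is recorded.
    access = "available" if das.restriction else "openly available"
    sentence = f"{das.description} are {access} in {das.repository} at {das.identifier}"
    if das.license:
        sentence += f", under a {das.license} license"
    sentence += "."
    if das.restriction:
        sentence += f" {das.restriction}"
    return sentence


if __name__ == "__main__":
    example = DataAvailability(
        description="The datasets generated during the current study",
        repository="the Zenodo repository",
        identifier="https://doi.org/10.5281/zenodo.1234567",  # placeholder DOI
        license="CC BY 4.0",
    )
    print(render_statement(example))
```

Keeping the fields separate makes it easier to notice a missing element (for example, no license) before the statement is written into the manuscript.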



Scenarios and Examples:

Scenario 1: Data is in an Open Access Repository (Recommended & Most Common)

This is the ideal scenario, promoting maximum discoverability and reuse.

  • Examples of Repositories: Zenodo, Figshare, Dryad, Open Science Framework (OSF), institutional repositories, domain-specific repositories (e.g., NCBI GEO for gene expression, PDB for protein structures).

  • Key elements: Repository name, DOI (Digital Object Identifier), URL, license.


Example 1.1: Fully Open Data in a General Repository

"The datasets generated and/or analyzed during the current study are openly available in the Zenodo repository, at [https://doi.org/10.5281/zenodo.1234567], under a CC BY 4.0 license."

Example 1.2: Raw Data and Analysis Scripts in a Repository

"The raw data supporting the findings of this study, along with the analysis scripts, are openly available in the Figshare repository at [https://doi.org/10.6084/m9.figshare.12345678.v1]. The data are published under a Creative Commons Attribution 4.0 International Public License (CC BY 4.0)."

Example 1.3: Data in an Institutional Repository

"The data supporting the findings of this study are available in the [Your University Name] Institutional Repository, [https://example.edu/data/dataset-xyz], with the persistent identifier [hdl:12345/67890]."

Scenario 2: Data is Available as Supplementary Material with the Publication

Some journals host supplementary data directly on their website alongside the article. This is less ideal for data citation and long-term discoverability than dedicated repositories but is common.

Example 2.1: Supplementary Files with the Article

"The supplementary data supporting the findings of this study are available within the article and its supplementary files on the journal's website, accessible via the article's DOI: [https://doi.org/10.1002/xyz.12345]."

Scenario 3: Data is NOT Publicly Available Due to Restrictions (e.g., Privacy, Ethics, Commercial)

This is common for sensitive data (e.g., human participant data, proprietary information). Avoid simply stating "data available upon request" without further detail: many journals discourage this wording because it gives readers no guarantee that the data will actually be provided or remain accessible over time.

Example 3.1: Restricted Access Due to Participant Privacy/Ethics

"The datasets generated and/or analyzed during the current study contain sensitive participant information and are therefore not publicly available. Access to these data may be granted upon reasonable request to the [relevant ethics committee/institutional review board/data custodian] and subject to appropriate data sharing agreements and ethical review. Requests should be directed to the corresponding author [Corresponding Author's Email Address] who can facilitate this process."

Example 3.2: Restricted Access Due to Proprietary/Commercial Reasons

"The datasets generated and/or analyzed during the current study are proprietary and contain commercially sensitive information. Therefore, they are not publicly available. Data may be made available from the [Company Name] upon reasonable request and subject to a formal data access agreement. Please contact [Contact Person/Department Email Address] for further information."

Scenario 4: A Combination of Publicly Available and Restricted Data

You might have some data openly available and other parts restricted.

Example 4.1: Mixed Availability

"The processed data and analysis scripts are publicly available in the Open Science Framework (OSF) repository at [https://doi.org/10.17605/OSF.IO/ABCDE] under a CC BY 4.0 license. However, the raw participant-level data contain sensitive personal information and are not publicly available due to ethical restrictions. Access to the raw data may be granted upon reasonable request to the [Ethics Committee Name] and approval of a data access request. Please contact the corresponding author [Corresponding Author's Email Address] for guidance on the application process."

Scenario 5: Data Available Upon Request (Last Resort, Often Discouraged)

Only use this if no suitable repository exists and there are no specific restrictions preventing public deposit. Most journals now strongly prefer that data be deposited in a public repository.

Example 5.1: Data on Request (with caveats)

"The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. While efforts were made to deposit all relevant data, a suitable public repository for this specific data type could not be identified at the time of publication."

(Note: This is generally considered the weakest option and may not be accepted by all journals.)


Important Tips for Your DAS:


  • Check Journal Requirements: Always consult the specific journal's "Instructions for Authors" regarding data availability statements and preferred repositories.

  • Get a DOI: If using a repository, ensure your dataset gets a persistent identifier like a DOI. This makes it citable and permanently discoverable.

  • Choose a License: For open data, a permissive license like CC BY (Creative Commons Attribution) or CC0 (Public Domain Dedication) is highly recommended.

  • Use Open Formats: When sharing data, use open, non-proprietary file formats (e.g., CSV, TXT, JSON, TIFF) where possible, rather than proprietary formats (e.g., .xls, .sps, .mat) that require specific software.

  • FAIR Principles: Aim for your data to be Findable, Accessible, Interoperable, and Reusable. Placing it in a good repository with a DOI and license helps achieve this.

  • Consistency: Ensure your DAS is consistent with your Data Management Plan (DMP), if you have one.
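
The open-formats tip above can be checked mechanically before upload. The sketch below is only a heuristic based on file extensions: the two extension sets are taken from the examples in that tip (with `.tif` added as an assumed alias of `.tiff`), and the `classify` function name is hypothetical. A real deposit may need a longer, field-specific list.

```python
from pathlib import Path

# Extension sets drawn from the examples in the tip above; extend as needed.
OPEN_EXTENSIONS = {".csv", ".txt", ".json", ".tiff", ".tif"}
PROPRIETARY_EXTENSIONS = {".xls", ".sps", ".mat"}


def classify(filename: str) -> str:
    """Return 'open', 'proprietary', or 'unknown' based on the file extension."""
    extension = Path(filename).suffix.lower()
    if extension in OPEN_EXTENSIONS:
        return "open"
    if extension in PROPRIETARY_EXTENSIONS:
        return "proprietary"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ["results.csv", "survey.sps", "figure1.TIFF", "notes.docx"]:
        print(f"{name}: {classify(name)}")
```

Files reported as "proprietary" are candidates for export to an open format; "unknown" only means the extension is not in either set, not that the format is unsuitable.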


Choose the example that best fits your situation and customize it with your specific repository, DOI, and any relevant details. Good luck!

Grok (xAI)

Example Data Availability Statement

Here's a concise template for a data availability statement, focusing on stating where your data can be accessed via a repository. You can customize it based on your specific details (e.g., the repository name, dataset DOI, or any access restrictions). This follows common academic and publishing guidelines (like those from journals such as Nature or PLOS) to ensure transparency and reproducibility.

Template:
"The data supporting the findings of this study are openly available in [Repository Name] at [URL or DOI], reference number [Dataset Identifier]."

Example (using a real repository like Zenodo):
"The data supporting the findings of this study are openly available in Zenodo at https://zenodo.org/records/[Your Record ID], reference number [e.g., 1234567]."

Tips for Implementation


  • Choose a Repository: If you haven't already, use a suitable one like Zenodo (free, DOI assignment), Figshare (easy uploads), Dryad (life sciences focus), or domain-specific ones (e.g., NCBI for genomics). Ensure it's FAIR-compliant (Findable, Accessible, Interoperable, Reusable).

  • What to Include: Specify if data is fully open, requires login, or has embargoes. If sensitive data is involved, mention ethical restrictions (e.g., "Contact the corresponding author for access due to privacy concerns").

  • Length and Placement: Keep it brief (1-2 sentences) and place it at the end of your manuscript or in a dedicated section.

  • Verification: Double-check the repository link/DOI works and complies with your journal's policies (e.g., some require data to be deposited before submission).
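
Part of the verification step above can be done offline by checking that an identifier at least looks like a DOI and converting it to a resolver link. The sketch below is an assumption-laden heuristic: the regular expression is a simplified pattern for modern DOIs, not an official validator, and the function names `normalize_doi` and `doi_url` are hypothetical. It does not confirm that the DOI actually resolves; that still requires opening the link.

```python
import re
from typing import Optional

# Simplified heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written in manuscripts.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(value: str) -> Optional[str]:
    """Strip a known prefix and return the bare DOI, or None if it does not look like one."""
    candidate = value.strip()
    lowered = candidate.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            candidate = candidate[len(prefix):]
            break
    return candidate if DOI_PATTERN.match(candidate) else None


def doi_url(value: str) -> Optional[str]:
    """Return the https://doi.org/ link for a DOI-like string, or None."""
    doi = normalize_doi(value)
    return f"https://doi.org/{doi}" if doi else None


if __name__ == "__main__":
    print(doi_url("doi:10.5281/zenodo.1234567"))          # placeholder DOI
    print(doi_url("https://zenodo.org/records/1234567"))  # a record URL, not a DOI
```

A `None` result is a prompt to look again at the identifier, for instance when a repository landing-page URL was pasted where the journal expects a DOI.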


If you provide more details (e.g., your field of study, repository choice, or specific restrictions), I can refine this further!

Analysis

This comparison demonstrates the different approaches each AI model takes when responding to the same prompt. Here are the key differences observed:

Response Characteristics

Gemini: Provides a direct response with 91 sentences.

Grok: Provides a direct response with 25 sentences.

Key Takeaways

  • Each model brings unique strengths to this type of query
  • Response styles vary significantly between models
  • Consider your specific use case when choosing between these models



This comparison was generated using the SNEOS AI Comparison Tool. Published: October 15, 2025 | Models: Gemini, Grok