Dataset Profile
Data collections and scientific datasets
Profile Information
Schema Type: Dataset
Version: v1
Category: Content
Profile URL: https://llmprofiles.org/profiles/content/dataset/v1/
✅ Best Practices
- Use schema:Dataset with proper name, description, and distribution information.
- Include dataset format, size, and access information.
- Add license and usage rights when available.
- Use structured metadata for better data discovery.
- Include data quality and provenance information.
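The practices above can be sketched in Python as a fuller Dataset description. This is only an illustration: the property names (license, creator, distribution, encodingFormat, contentUrl, contentSize) come from the Schema.org vocabulary, all values are hypothetical placeholders, and it is an assumption that the profile context accepts these properties unchanged.

```python
import json

# All values are hypothetical placeholders; only the @context URL and
# "@type": "Dataset" come from the profile page itself.
dataset = {
    "@context": "https://llmprofiles.org/profiles/content/dataset/v1/",
    "@type": "Dataset",
    "name": "Example Rainfall Measurements",
    "description": "Daily rainfall totals from example weather stations.",
    # License and usage rights (Schema.org `license`).
    "license": "https://creativecommons.org/licenses/by/4.0/",
    # Provenance: who produced the data (Schema.org `creator`).
    "creator": {"@type": "Organization", "name": "Example Data Lab"},
    # Format, size, and access information (Schema.org `distribution`).
    "distribution": [
        {
            "@type": "DataDownload",
            "encodingFormat": "text/csv",
            "contentUrl": "https://example.org/data/rainfall.csv",
            "contentSize": "12 MB",
        }
    ],
}


def to_script_tag(data: dict) -> str:
    """Serialize a JSON-LD dict into an embeddable script element."""
    # Escape "</" so a stray "</script>" inside a value cannot close the tag.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_script_tag(dataset))
```

Building the markup from a dictionary rather than a hand-written string keeps the JSON well-formed as properties are added.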
❌ Avoid These
- Do not use for general content or individual data points.
- Do not use for articles or blog posts about data.
- Do not use for software applications or tools.
- Do not use for live data feeds or real-time data.
📄 Profile Definition
JSON-LD profile definition with all properties and constraints
🔧 Page Schema
JSON Schema for validating page markup and on-page structured data
📊 Output Schema
JSON Schema for RAG/ML output validation and data extraction
🎓 Training Data
Sample training data in JSONL format for fine-tuning LLMs
Implementation Examples
The following example shows a minimal implementation of this profile:
Basic Implementation
<script type="application/ld+json">
{
  "@context": "https://llmprofiles.org/profiles/content/dataset/v1/",
  "@type": "Dataset",
  "name": "Example Dataset",
  "description": "This is an example Dataset implementation."
}
</script>
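To check markup that is already embedded in a page, the JSON-LD first has to be pulled out of the HTML. A minimal sketch using only the Python standard library follows; the class and function names (JsonLdExtractor, extract_jsonld) are hypothetical and not part of the profile or its tooling.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script bodies arrive here as raw text; keep them only inside JSON-LD tags.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._chunks)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    """Return a list of parsed JSON-LD objects found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


page = """<html><head>
<script type="application/ld+json">
{
  "@context": "https://llmprofiles.org/profiles/content/dataset/v1/",
  "@type": "Dataset",
  "name": "Example Dataset",
  "description": "This is an example Dataset implementation."
}
</script>
</head><body></body></html>"""

blocks = extract_jsonld(page)
print(blocks[0]["name"])  # Example Dataset
```

Malformed JSON inside the script tag raises json.JSONDecodeError here, which is usually the desired behaviour for a validation step.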
Schema Information
This profile is based on Schema.org Dataset and extends it with LLM-specific properties and constraints.
Key Properties:
- @context: Profile context URL
- @type: Dataset
- name: Required name/title
- description: Required description
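The key properties listed above can be checked with a few lines of Python. This is a rough sketch, not the profile's official validator: the function name check_dataset is hypothetical, and it assumes @context is given as a plain string, as in the basic example, although JSON-LD in general also allows objects and arrays there.

```python
# Properties the profile page lists as key properties.
REQUIRED = ("@context", "@type", "name", "description")


def check_dataset(markup: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED:
        value = markup.get(key)
        # Assumption: each key property must be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty: {key}")
    type_value = markup.get("@type")
    if isinstance(type_value, str) and type_value.strip() and type_value != "Dataset":
        problems.append(f"@type must be 'Dataset', got '{type_value}'")
    return problems


example = {
    "@context": "https://llmprofiles.org/profiles/content/dataset/v1/",
    "@type": "Dataset",
    "name": "Example Dataset",
    "description": "This is an example Dataset implementation.",
}
print(check_dataset(example))  # []
```

A check like this only covers presence and type of the four listed properties; the profile's Page Schema may impose further constraints that are not shown on this page.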
Validation Tools
Use these tools to validate your implementation: