sinc-LLM Training Dataset: 15 Real-World Structured Prompts

sinc-prompt Dataset

275 observations of prompt decomposition using the Nyquist-Shannon spectral format for AI model training

275 observations | 15 domains | 6 bands per entry | License: CC BY 4.0

Dataset Description

The sinc-prompt dataset contains structured decompositions of natural language prompts into the 6-band sinc format, derived from the Nyquist-Shannon sampling theorem applied to specification space.

Each observation transforms a single raw prompt (1 sample) into a complete 6-band signal reconstruction, eliminating the aliasing (hallucination) that occurs when LLMs receive under-specified inputs.

The dataset spans 15 professional domains, including software engineering, business strategy, legal, healthcare, finance, marketing, product management, data science, customer success, software architecture, DevOps/SRE, and human resources.

Research paper: DOI 10.5281/zenodo.19152668 | Source: github.com/mdalexandre/sinc-llm

Format Specification

Every sinc prompt follows this JSON schema:

{
  "formula": "x(t) = \u03a3 x(nT) \u00b7 sinc((t - nT) / T)",
  "T": "specification-axis",
  "fragments": [
    {"n": 0, "t": "PERSONA",     "x": "Who should answer (role, expertise, experience level)"},
    {"n": 1, "t": "CONTEXT",     "x": "Situation, facts, background, environment"},
    {"n": 2, "t": "DATA",        "x": "Specific inputs, measurements, code, documents"},
    {"n": 3, "t": "CONSTRAINTS", "x": "Rules, boundaries, success/fail criteria (42.7% of quality)"},
    {"n": 4, "t": "FORMAT",      "x": "Output structure, tables, lists, code blocks (26.3% of quality)"},
    {"n": 5, "t": "TASK",        "x": "The objective in 1-2 sentences"}
  ]
}
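As a minimal sketch of how the schema above might be checked in practice, the following Python snippet validates one sinc object. The helper name `validate_sinc_prompt`, the assumption that fragments must appear in order n = 0..5, and the example band texts are all illustrative and are not part of the published dataset or its tooling.

```python
import json

# Band names in the order given by the schema above (n = 0..5).
BAND_ORDER = ["PERSONA", "CONTEXT", "DATA", "CONSTRAINTS", "FORMAT", "TASK"]


def validate_sinc_prompt(obj):
    """Return a list of problems; an empty list means the object fits the schema.

    Hypothetical helper: it only mirrors the fields shown in the schema above
    and assumes the fragments are listed in band order.
    """
    problems = []
    for key in ("formula", "T", "fragments"):
        if key not in obj:
            problems.append(f"missing top-level key: {key}")

    fragments = obj.get("fragments", [])
    if len(fragments) != len(BAND_ORDER):
        problems.append(
            f"expected {len(BAND_ORDER)} fragments, found {len(fragments)}"
        )

    for index, (fragment, expected) in enumerate(zip(fragments, BAND_ORDER)):
        if fragment.get("n") != index:
            problems.append(f"fragment {index}: 'n' should be {index}")
        if fragment.get("t") != expected:
            problems.append(f"fragment {index}: 't' should be {expected}")
        if not str(fragment.get("x", "")).strip():
            problems.append(f"fragment {index}: 'x' is empty")
    return problems


# Illustrative content only; these band texts are not taken from the dataset.
example = {
    "formula": "x(t) = \u03a3 x(nT) \u00b7 sinc((t - nT) / T)",
    "T": "specification-axis",
    "fragments": [
        {"n": 0, "t": "PERSONA", "x": "Senior Python reviewer"},
        {"n": 1, "t": "CONTEXT", "x": "Function runs in a nightly batch job"},
        {"n": 2, "t": "DATA", "x": "def total(xs): return sum(xs)"},
        {"n": 3, "t": "CONSTRAINTS", "x": "Report only confirmed bugs"},
        {"n": 4, "t": "FORMAT", "x": "Numbered list, one finding per item"},
        {"n": 5, "t": "TASK", "x": "Review the function for bugs"},
    ],
}

print(validate_sinc_prompt(example))  # an empty list means no problems were found
print(json.dumps(example, ensure_ascii=False))  # one line, as it could be stored
```

Returning a list of problems rather than raising on the first one makes it easier to report every missing or misordered band at once.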

Band Importance Weights

Band | Fragment    | Weight | Description
n=3  | CONSTRAINTS | 42.7%  | Rules, boundaries, criteria (the most impactful band)
n=4  | FORMAT      | 26.3%  | Output structure shapes response organization
n=0  | PERSONA     | 7.0%   | Role identity sets expertise framing
n=1  | CONTEXT     | 6.3%   | Situational facts ground the response
n=2  | DATA        | 3.8%   | Specific inputs reduce ambiguity
n=5  | TASK        | 2.8%   | Objective provides direction
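One possible way to work with these weights is to add up the listed share for whichever bands a prompt actually fills in. The sketch below does that arithmetic. The function `covered_weight` is a hypothetical illustration, not a metric defined by the dataset. Note that the six listed weights sum to 88.9%, and the table does not say how the remainder is attributed.

```python
# Weights copied from the table above, in percent.
BAND_WEIGHTS = {
    "CONSTRAINTS": 42.7,
    "FORMAT": 26.3,
    "PERSONA": 7.0,
    "CONTEXT": 6.3,
    "DATA": 3.8,
    "TASK": 2.8,
}


def covered_weight(present_bands):
    """Sum the listed weights of the bands that are present.

    Hypothetical illustration: the source gives per-band weights but does not
    define an aggregate score, so this is plain addition over the table.
    """
    return round(sum(BAND_WEIGHTS[band] for band in present_bands), 1)


# A prompt that only states the task and some context.
print(covered_weight(["TASK", "CONTEXT"]))

# A prompt that also spells out constraints and output format.
print(covered_weight(["TASK", "CONTEXT", "CONSTRAINTS", "FORMAT"]))

# All six bands together.
print(covered_weight(BAND_WEIGHTS))
```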

Sample Entries

# | Raw Prompt | Task Type | Domain | Bands
1 | Review this Python function for bugs and performance issues | code_review | Software Engineering | 6/6
2 | My React app crashes when I click the submit button | bug_debugging | Software Engineering | 6/6
3 | Research the AI coding assistant market for me | market_research | Business Strategy | 6/6
4 | Review this SaaS contract for any issues | legal_contract_review | Legal | 6/6
5 | What could cause persistent fatigue and joint pain? | medical_diagnosis | Healthcare | 6/6

Showing 5 of 275 entries. View all 15 published examples with full sinc JSON.

Download

JSONL Format (Machine-Readable)

One JSON object per line. Each object contains:

{
  "raw_prompt": "string",
  "sinc_json": { ... full 6-band sinc object ... },
  "task_type": "string",
  "domain": "string"
}
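A minimal sketch of reading the file, assuming it follows the record layout above with one JSON object per line. The function names `load_sinc_examples` and `count_by_domain` are hypothetical helpers written for this illustration, and only the Python standard library is used.

```python
import json
from collections import Counter

# Keys listed in the record layout above.
REQUIRED_KEYS = {"raw_prompt", "sinc_json", "task_type", "domain"}


def load_sinc_examples(lines):
    """Parse an iterable of JSONL lines into a list of records.

    Blank lines are skipped; a record missing any required key raises
    ValueError with the offending line number.
    """
    records = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {line_number}: missing keys {sorted(missing)}")
        records.append(record)
    return records


def count_by_domain(records):
    """Count how many records fall in each domain."""
    return Counter(record["domain"] for record in records)


# Usage, assuming the downloaded file sits in the working directory:
# with open("sinc-examples.jsonl", encoding="utf-8") as handle:
#     records = load_sinc_examples(handle)
# print(count_by_domain(records))
```

Because `load_sinc_examples` accepts any iterable of lines, it works equally on an open file handle or on a list of strings in a test.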

Download sinc-examples.jsonl

15 examples published. Full 275-observation dataset available upon request.

Citation

If you use this dataset in research or training, please cite:

@misc{alexandre2026sinc,
  author    = {Alexandre, Mario},
  title     = {sinc-LLM: Nyquist-Shannon Prompt Decomposition for Large Language Models},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.19152668},
  url       = {https://doi.org/10.5281/zenodo.19152668},
  note      = {Dataset: 275 observations, 15 domains, 6-band spectral format}
}

License

This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

You are free to share and adapt the material for any purpose, including commercial use, provided you give appropriate credit.
