# Overview

> Extract structured data from income tax documents like Form 16 and Form 26AS using advanced OCR technology.

# Income Tax OCR APIs

The Income Tax OCR APIs extract structured data from tax documents. Use them to parse Form 16 and Form 26AS documents automatically for tax compliance and income verification workflows.

## Key Features

<CardGroup cols={2}>
  <Card title="Form 16 Extraction" icon="file-invoice">
    Extract salary details, TDS information, and investment declarations from employer-issued TDS certificates.
  </Card>

  <Card title="Form 26AS Extraction" icon="file-lines">
    Extract creditable TDS amounts, deductor details, and tax payment information from annual TDS statements.
  </Card>

  <Card title="High Accuracy" icon="check-circle">
    AI models return a confidence score with each extraction, so you can judge how reliable the data is across document formats.
  </Card>

  <Card title="Bulk Processing" icon="layer-group">
    Process multiple documents asynchronously with job-based processing for scalability.
  </Card>
</CardGroup>

## How It Works

<Steps>
  <Step title="Upload Document">
    Submit a PDF or image file with optional metadata such as the taxpayer PAN and financial year.
  </Step>

  <Step title="AI Processing">
    The system automatically detects the document type and extracts text, tables, and structured data.
  </Step>

  <Step title="Data Validation">
    Extracted information is validated against expected patterns and formats.
  </Step>

  <Step title="Get Results">
    Receive a structured JSON response with confidence scores and extracted data.
  </Step>
</Steps>
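
The "Data Validation" step happens on the server, and this page does not describe its rules. As a rough client-side counterpart, the sketch below checks the two optional metadata fields before upload. It assumes the standard PAN layout (five letters, four digits, one letter) and the `YYYY-YY` financial year style used in the request example on this page; the function names are hypothetical and not part of the API.

```python
import re

# PAN layout: five uppercase letters, four digits, one uppercase letter.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
# Financial year in the "2024-25" style (assumed from the request example).
FY_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})")


def is_valid_pan(pan: str) -> bool:
    """Return True if the string matches the PAN layout exactly."""
    return PAN_PATTERN.fullmatch(pan) is not None


def is_valid_financial_year(fy: str) -> bool:
    """Return True if the suffix is the year following the start year."""
    match = FY_PATTERN.fullmatch(fy)
    if match is None:
        return False
    start_year = int(match.group(1))
    end_suffix = int(match.group(2))
    return (start_year + 1) % 100 == end_suffix


print(is_valid_pan("AAAPI0000A"), is_valid_financial_year("2024-25"))
```

A format check like this only catches typos early; it says nothing about whether the PAN actually exists or belongs to the taxpayer.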

## API Categories

<CardGroup cols={2}>
  <Card title="Form 16 APIs" icon="file-invoice" href="/api-reference/it/ocr/form_16/endpoints/read_form_16">
    Extract data from TDS certificates issued by employers.
  </Card>

  <Card title="Form 26AS APIs" icon="file-lines" href="/api-reference/it/ocr/form_26as/endpoints/read_form_26as">
    Extract data from annual TDS statements from the Income Tax Department.
  </Card>
</CardGroup>

## API Endpoints

| Endpoint                | Method | Description                               |
| ----------------------- | ------ | ----------------------------------------- |
| `/it/ocr/form-16/pdf`   | POST   | Extract data from Form 16 PDF documents   |
| `/it/ocr/form-26as/pdf` | POST   | Extract data from Form 26AS PDF documents |

## Common Use Cases

* **Income Verification**: Extract salary and TDS details for loan applications and account opening
* **Tax Compliance**: Auto-populate ITR forms with extracted data from tax documents
* **Payroll Reconciliation**: Validate employee salary information against Form 16
* **Document Archival**: Index and search historical tax documents for audit purposes
* **Bulk Processing**: Process multiple employee documents for large organizations

## Integration Examples

### Basic Form 16 Extraction

```json
{
  "file": "Base64 encoded PDF",
  "taxpayer_pan": "AAAPI0000A",
  "financial_year": "2024-25"
}
```
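
One way to assemble a request body like the one above is sketched here in Python. It assumes the `file` field takes standard (not URL-safe) Base64 text, which this page does not specify; `build_form16_request` is a hypothetical helper name. Sending the request is left out because the base URL and authentication headers are documented on the endpoint pages, not here.

```python
import base64
import json


def build_form16_request(pdf_bytes: bytes, taxpayer_pan: str, financial_year: str) -> str:
    """Serialize a Form 16 extraction request body as a JSON string."""
    payload = {
        # Assumption: standard Base64 alphabet, encoded as ASCII text.
        "file": base64.b64encode(pdf_bytes).decode("ascii"),
        "taxpayer_pan": taxpayer_pan,
        "financial_year": financial_year,
    }
    return json.dumps(payload)


# Placeholder bytes stand in for a real PDF read from disk.
body = build_form16_request(b"%PDF-1.7 placeholder", "AAAPI0000A", "2024-25")
print(body)
```

In practice the bytes would come from something like `Path("form16.pdf").read_bytes()`.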

### Response Structure

```json
{
  "code": 200,
  "data": {
    "document_type": "form_16",
    "confidence": 0.95,
    "extraction_details": {
      "employer": {
        "name": "ABC Technologies Pvt Ltd",
        "pan": "AABCT1234K"
      },
      "employee": {
        "name": "Rajesh Kumar",
        "pan": "BXRPK5678A"
      },
      "salary": {
        "gross_salary": 2400000,
        "basic": 1000000,
        "tds_deducted": 350000
      }
    }
  }
}
```
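
A response shaped like the sample above could be read into a typed object as follows. This is a minimal sketch that assumes every field in the sample is present; the real schema may include more fields or omit some, so check the endpoint reference before relying on it. `Form16Summary` and `summarize_form16` are illustrative names.

```python
import json
from dataclasses import dataclass


@dataclass
class Form16Summary:
    document_type: str
    confidence: float
    employer_name: str
    employee_pan: str
    gross_salary: int
    tds_deducted: int


def summarize_form16(response_text: str) -> Form16Summary:
    """Pull a few fields out of a response shaped like the sample."""
    body = json.loads(response_text)
    if body.get("code") != 200:
        raise ValueError(f"unexpected response code: {body.get('code')!r}")
    data = body["data"]
    details = data["extraction_details"]
    return Form16Summary(
        document_type=data["document_type"],
        confidence=data["confidence"],
        employer_name=details["employer"]["name"],
        employee_pan=details["employee"]["pan"],
        gross_salary=details["salary"]["gross_salary"],
        tds_deducted=details["salary"]["tds_deducted"],
    )


# Copy of the sample response from this page, used only as example input.
SAMPLE = """
{"code": 200, "data": {"document_type": "form_16", "confidence": 0.95,
 "extraction_details": {
   "employer": {"name": "ABC Technologies Pvt Ltd", "pan": "AABCT1234K"},
   "employee": {"name": "Rajesh Kumar", "pan": "BXRPK5678A"},
   "salary": {"gross_salary": 2400000, "basic": 1000000, "tds_deducted": 350000}}}}
"""

summary = summarize_form16(SAMPLE)
print(summary.employer_name, summary.gross_salary, summary.tds_deducted)
```

Indexing with `[...]` rather than `.get(...)` is deliberate: a missing field raises `KeyError` immediately instead of passing `None` into downstream calculations.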

## Best Practices

* **Document Quality**: Upload clear, well-lit scans or digital PDFs for best extraction results
* **Complete Documents**: Ensure all pages are included for multi-page documents
* **Metadata**: Provide the taxpayer PAN and financial year to improve validation
* **Confidence Thresholds**: Set appropriate confidence thresholds based on your use case
* **Validation**: Always validate extracted data, especially for high-value decisions
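
The confidence-threshold practice above can be expressed as a small routing function. The cut-offs used here (0.90 and 0.60) are illustrative assumptions, not values recommended by the API; tune them to the cost of a wrong value in your workflow. `route_extraction` is a hypothetical name.

```python
def route_extraction(
    confidence: float,
    auto_accept: float = 0.90,   # assumed threshold, adjust per use case
    reject_below: float = 0.60,  # assumed threshold, adjust per use case
) -> str:
    """Map a confidence score in [0, 1] to a handling decision."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= auto_accept:
        return "accept"
    if confidence < reject_below:
        return "reject"
    return "manual_review"


for score in (0.95, 0.75, 0.40):
    print(score, route_extraction(score))
```

For high-value decisions such as loan approvals, raising `auto_accept` sends more documents to manual review, trading throughput for fewer unnoticed extraction errors.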

## Related Documentation

* [Form 16 Extraction API](/api-reference/it/ocr/form_16/endpoints/read_form_16)
* [Form 26AS Extraction API](/api-reference/it/ocr/form_26as/endpoints/read_form_26as)
* [Calculator APIs](/api-reference/it/calculator/overview)
* [Report APIs](/api-reference/it/report/overview)

<AccordionGroup>
  <Accordion title="What document formats are supported?">
    The APIs support PDF documents and common image formats (JPEG, PNG). For best results, use original digital PDFs rather than scanned images.
  </Accordion>

  <Accordion title="How accurate is the OCR extraction?">
    Accuracy varies by document quality and type; confidence scores typically range from 85% to 95%. Each extracted field includes a confidence score for validation.
  </Accordion>

  <Accordion title="Can I process multiple documents at once?">
    Yes. The APIs support asynchronous, job-based processing for bulk workloads. Submit multiple documents and track progress via job status endpoints.
  </Accordion>

  <Accordion title="What happens if extraction confidence is low?">
    Fields with low confidence scores are flagged in the response. You can set minimum confidence thresholds or implement manual review workflows for critical data.
  </Accordion>

  <Accordion title="Is there a file size limit?">
    Individual documents should be under 10 MB. For larger files or bulk processing, contact support for optimized processing options.
  </Accordion>
</AccordionGroup>
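
The format and size answers above suggest a simple pre-upload check. The sketch below identifies PDF, PNG, and JPEG files by their leading signature bytes and compares the size against the limit. It assumes "10 MB" means 10,000,000 bytes, the more conservative reading; the helper names are hypothetical.

```python
from typing import List, Optional

# Assumption: treat the documented limit as 10,000,000 bytes (decimal MB).
MAX_UPLOAD_BYTES = 10 * 1000 * 1000


def detect_format(data: bytes) -> Optional[str]:
    """Identify the file type from its leading signature bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return None


def check_upload(data: bytes) -> List[str]:
    """Return a list of problems; an empty list means the file looks fine."""
    problems = []
    if detect_format(data) is None:
        problems.append("unsupported format: expected PDF, JPEG, or PNG")
    if len(data) >= MAX_UPLOAD_BYTES:
        problems.append("file too large: must be under 10 MB")
    return problems


print(check_upload(b"%PDF-1.4 placeholder"))
```

Checking signature bytes rather than the file extension catches renamed files before they consume an API call.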
