# Tarot Spread Recognition Model Methods

## Objective

*Develop a model that detects a user’s 5-card Tarot spread from an image and converts it into structured text.*

## Key Points

Highlights of this experiment:

1. Image Processing: Detect and classify each Tarot card in the spread.
2. OCR: If the image has text (like card names), extract it.
3. Layout Analysis: Recognize the card positions and their spread meaning.
4. Convert to Text: Generate a structured textual output describing the spread.
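
As a rough illustration of steps 3 and 4, the sketch below orders already-detected cards left to right and renders them as structured text. The `DetectedCard` type, the `describe_spread` helper, and the position meanings are assumptions made for this example; the write-up does not define the spread's position meanings or an output schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed position meanings for a 5-card spread (not specified in this write-up).
POSITION_MEANINGS = ("past", "present", "future", "advice", "outcome")


@dataclass
class DetectedCard:
    """Hypothetical result of the detection/classification steps for one card."""
    name: str                        # e.g. "The Fool", from a classifier or OCR
    box: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
    is_reversed: bool = False        # True if the card appears upside down


def describe_spread(cards: List[DetectedCard]) -> str:
    """Order cards left to right and describe each position as one text line."""
    if len(cards) != len(POSITION_MEANINGS):
        raise ValueError(f"expected {len(POSITION_MEANINGS)} cards, got {len(cards)}")
    # Simple stand-in for layout analysis: sort by horizontal centre of each box.
    ordered = sorted(cards, key=lambda c: (c.box[0] + c.box[2]) / 2)
    lines = []
    for index, (meaning, card) in enumerate(zip(POSITION_MEANINGS, ordered), start=1):
        orientation = "reversed" if card.is_reversed else "upright"
        lines.append(f"{index}. {meaning}: {card.name} ({orientation})")
    return "\n".join(lines)
```

Sorting by horizontal centre only works for a single-row layout; a cross-shaped or stacked spread would need a different ordering rule.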

## Method Considerations

This experiment relies on methods that do not use multimodal models; in other words, the pipeline is implemented programmatically (and manually).

Two methods for implementing this:

1. Leverage pre-trained OCR tools.

   Problem: The image must be cropped beforehand. This requires a predefined box window for each card position, so the image has to be taken from a fixed angle to be usable.
2. Build a CNN for object detection: This method builds an object detection model from CNN layers. The objective is to approximate the box window coordinates (top-left, top-right, bottom-left, and bottom-right corners) for each card position.
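
To make the "box window" idea concrete, here is a minimal sketch of both representations. It assumes the five cards lie in a single row (the write-up does not specify the layout), and the names `fixed_windows` and `corner_target` are hypothetical helpers, not part of any library. The first function produces the fixed crop windows that method 1 depends on; the second shows the eight normalised corner values a CNN regression head could be trained to predict for one card in method 2.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def fixed_windows(width: int, height: int, n_cards: int = 5,
                  margin: float = 0.05) -> List[Box]:
    """Method 1: split the image into equal columns, one window per card.

    Assumes a single-row layout shot from a fixed angle (an assumption of
    this sketch). `margin` trims a fraction off each side of every window.
    """
    slot = width / n_cards
    top = round(height * margin)
    bottom = round(height * (1 - margin))
    boxes = []
    for i in range(n_cards):
        left = round(i * slot + slot * margin)
        right = round((i + 1) * slot - slot * margin)
        boxes.append((left, top, right, bottom))
    return boxes


def corner_target(box: Box, width: int, height: int) -> List[float]:
    """Method 2: flatten the four corners into 8 values scaled to [0, 1].

    Order: top-left, top-right, bottom-left, bottom-right, each as (x, y).
    """
    left, top, right, bottom = box
    corners = [(left, top), (right, top), (left, bottom), (right, bottom)]
    return [value for x, y in corners for value in (x / width, y / height)]
```

Each window uses the `(left, top, right, bottom)` ordering, which is the same box convention that Pillow's `Image.crop` accepts, so the crops could then be passed to an OCR tool. An axis-aligned box is used here for simplicity; predicting four independent corners would also cover tilted cards.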

Alternatively:

1. Combine the two methods above, relying on method 1 as the base model in the application. Over time, the CNN model is used to adapt to user feedback and corrections.
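
One possible way to wire up the feedback loop is sketched below. The `FeedbackStore` class and its `record` method are hypothetical and only illustrate the idea: the application uses the base method's boxes unless the user corrects them, and every correction is kept as a labelled example that the CNN could later be trained on. How and when the CNN is retrained is not covered by this write-up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class FeedbackStore:
    """Collects user-corrected boxes as future training labels (hypothetical)."""
    examples: List[Tuple[str, List[Box]]] = field(default_factory=list)

    def record(self, image_id: str, predicted: List[Box],
               corrected: Optional[List[Box]] = None) -> List[Box]:
        """Return the boxes to use for this image.

        If the user supplied a correction that differs from the base
        prediction, store it as a training example and use it instead.
        """
        if corrected is not None and corrected != predicted:
            self.examples.append((image_id, corrected))
            return corrected
        return predicted
```

Storing only the corrected cases keeps the training set focused on images where the fixed windows failed, which is where the CNN is expected to add value.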

