fixed up file path, fixed up some gitignores, add version support, working on a better install and binary compilation
266
README.md
@@ -1,168 +1,196 @@
# Convertions

The `convertions` project is a collection of Python scripts designed to convert between various file formats and extract content from different types of files. This toolset includes scripts for converting CSV, JSON, Excel, HTML, Markdown, YAML, PNG, JPG, PDF files, and more.

The `convertions` project is a comprehensive suite of Python scripts designed to handle various file format conversions and content extractions. It supports operations on CSV, JSON, Excel, HTML, Markdown, YAML, PNG, JPG, PDF, audio, video, and more.

This tool is modular, highly extensible, and packaged for ease of use. You can run it as a Python script or compile it into a standalone binary for seamless deployment.

---

## Table of Contents

- [Installation](#installation)
- [Usage](#usage)
  - [CSV to Excel](#csv-to-excel)
  - [CSV to JSON](#csv-to-json)
  - [Excel to CSV](#excel-to-csv)
  - [HTML to Markdown](#html-to-markdown)
  - [JSON to CSV](#json-to-csv)
  - [Markdown to HTML](#markdown-to-html)
  - [YAML to Markdown](#yaml-to-markdown)
  - [PNG to JPG](#png-to-jpg)
  - [JPG to PNG](#jpg-to-png)
  - [PDF to JPG](#pdf-to-jpg)
  - [JPGs to PDF](#jpgs-to-pdf)
  - [Image to Markdown](#image-to-markdown)
  - [PDF to Markdown](#pdf-to-markdown)
  - [Global Options](#global-options)
  - [Available Commands](#available-commands)
- [Adding New Scripts](#adding-new-scripts)
- [Virtual Environment](#virtual-environment)
- [Compilation into a Binary](#compilation-into-a-binary)
- [Dependencies](#dependencies)
- [License](#license)
- [Acknowledgments](#acknowledgments)

---

## Installation

1. **Clone the Repository**:
   ```bash
   git clone https://github.com/yourusername/convertions.git
   cd convertions
   ```

### 1. Clone the Repository

```bash
git clone https://github.com/yourusername/convertions.git
cd convertions
```

2. **Set Up the Virtual Environment**:
   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```

### 2. Set Up the Virtual Environment

Create and activate a Python virtual environment to isolate dependencies:

```bash
python3 -m venv venv
source venv/bin/activate
```

3. **Install Dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

### 3. Install Dependencies

Install the required Python packages using the virtual environment's pip:

```bash
./venv/bin/pip install -r requirements.txt
```

4. **Add the Convertions Directory to PATH**:
   - For Bash:
     ```bash
     echo 'export PATH="$HOME/codeWS/Python3/convertions:$PATH"' >> ~/.bashrc
     source ~/.bashrc
     ```
   - For Zsh:
     ```bash
     echo 'export PATH="$HOME/codeWS/Python3/convertions:$PATH"' >> ~/.zshrc
     source ~/.zshrc
     ```

### 4. Optional: Add Convertions to Your PATH

To make the `convertions` command globally accessible:

- For Bash:
  ```bash
  echo 'export PATH="$HOME/convertions:$PATH"' >> ~/.bashrc
  source ~/.bashrc
  ```
- For Zsh:
  ```bash
  echo 'export PATH="$HOME/convertions:$PATH"' >> ~/.zshrc
  source ~/.zshrc
  ```

---

## Usage

### CSV to Excel

Convert a CSV file to an Excel file.

The `convertions` tool accepts various commands corresponding to supported operations. Each command is mapped to a specific utility script.

### Global Options

- `--help` or `-h`: Display help and a list of available commands.
- `--version` or `-v`: Print the version of the `convertions` tool.

```bash
convertions csvtoexcel <input_csv_path> <output_excel_path>
convertions --help
convertions --version
```

### CSV to JSON

Convert a CSV file to a JSON file.

### Available Commands

Each command has its own usage syntax. The full list follows:

#### General Format

```bash
convertions csvtojson <input_csv_path> <output_json_path>
convertions <command> <input_path> <output_path>
```
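
The global options and the general `convertions <command> <input_path> <output_path>` format could be parsed along the following lines. This is a minimal sketch built on Python's standard `argparse` module, not the project's actual entry point; `build_parser` and the `VERSION` value are hypothetical names chosen for illustration.

```python
import argparse

# Hypothetical placeholder; the real version string is defined by the project.
VERSION = "0.0.0"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `convertions <command> <paths...>` (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="convertions",
        description="Convert between file formats.",
    )
    # argparse adds -h/--help automatically; only the version flag is declared here.
    parser.add_argument(
        "-v", "--version", action="version", version=f"%(prog)s {VERSION}"
    )
    parser.add_argument("command", help="conversion command, e.g. csvtojson")
    # One or more paths, so commands that take several files also fit this shape.
    parser.add_argument("paths", nargs="+", help="input and output paths")
    return parser
```

Calling `build_parser().parse_args(["csvtojson", "in.csv", "out.json"])` yields a namespace with `command` set to `"csvtojson"` and `paths` set to the two file names. Using `nargs="+"` rather than two fixed positionals is a design choice that leaves room for commands accepting more than two paths.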

### Excel to CSV

Convert an Excel file to a CSV file.

```bash
convertions exceltocsv <input_excel_path> <output_csv_path>
```

#### File Operations

- **CSV to Excel**: `csvtoexcel <input_csv_path> <output_excel_path>`
- **CSV to JSON**: `csvtojson <input_csv_path> <output_json_path>`
- **CSV to YAML**: `csvtoyaml <input_csv_path> <output_yaml_path>`
- **Excel to CSV**: `exceltocsv <input_excel_path> <output_csv_path>`
- **Excel to JSON**: `exceltojson <input_excel_path> <output_json_path>`
- **HTML to Markdown**: `htmltomd <input_html_path> <output_md_path>`
- **HTML to PDF**: `htmltopdf <input_html_path> <output_pdf_path>`
- **Image to Markdown (OCR)**: `imagetomd <input_image_path> <output_md_path>`
- **JPG to PNG**: `jpgtopng <input_jpg_path> <output_png_path>`
- **JSON to CSV**: `jsontocsv <input_json_path> <output_csv_path>`
- **JSON to Excel**: `jsontoexcel <input_json_path> <output_excel_path>`
- **JSON to YAML**: `jsontoyaml <input_json_path> <output_yaml_path>`
- **Markdown Table to CSV**: `mdtabletocsv <input_md_path> <output_csv_path>`
- **Markdown to DOCX**: `mdtodocx <input_md_path> <output_docx_path>`
- **Markdown to HTML**: `mdtohtml <input_md_path> <output_html_path>`
- **Markdown to PDF**: `mdtopdf <input_md_path> <output_pdf_path>`
- **Markdown to YAML**: `mdtoyaml <input_md_path> <output_yaml_path>`
- **Merge PDFs**: `mergepdfs <output_pdf_path> <input_pdf1> <input_pdf2> ...`
- **PDF to JPG**: `pdftojpg <input_pdf_path> <output_jpg_base_path>`
- **PDF to Markdown**: `pdftomd <input_pdf_path> <output_md_path>`
- **PDF to Text**: `pdftotext <input_pdf_path> <output_txt_path>`
- **PNG to JPG**: `pngtojpg <input_png_path> <output_jpg_path>`
- **Text to Speech**: `texttospeech <input_text_path> <output_audio_path>`
- **Video to Audio**: `videotoaudio <input_video_path> <output_audio_path>`
- **YAML to JSON**: `yamltojson <input_yaml_path> <output_json_path>`
- **YAML to Markdown**: `yamltomd <input_yaml_path> <output_md_path>`
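
To give a sense of what one of these operations involves, here is a minimal sketch of a CSV to JSON conversion that uses only the Python standard library. It is not the project's implementation, which may well rely on `pandas`; the function name `csv_to_json` is an assumption made for this example.

```python
import csv
import json
from pathlib import Path


def csv_to_json(input_csv_path: str, output_json_path: str) -> int:
    """Write the rows of a CSV file as a JSON array of objects.

    The first CSV row is treated as the header. Returns the number of data rows.
    """
    # newline="" lets the csv module handle line endings itself.
    with open(input_csv_path, newline="", encoding="utf-8") as src:
        rows = list(csv.DictReader(src))
    # Every value stays a string; no type inference is attempted in this sketch.
    Path(output_json_path).write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return len(rows)
```

A call such as `csv_to_json("data.csv", "data.json")` mirrors the `csvtojson <input_csv_path> <output_json_path>` argument order listed above.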

### HTML to Markdown

Convert an HTML file to a Markdown file.

```bash
convertions htmltomd <input_html_path> <output_md_path>
```

### JSON to CSV

Convert a JSON file to a CSV file.

```bash
convertions jsontocsv <input_json_path> <output_csv_path>
```

### Markdown to HTML

Convert a Markdown file to an HTML file.

```bash
convertions mdtohtml <input_md_path> <output_html_path>
```

### YAML to Markdown

Convert a YAML file to a Markdown file.

```bash
convertions yamltomd <input_yaml_path> <output_md_path>
```

### PNG to JPG

Convert a PNG image to a JPG image.

```bash
convertions pngtojpg <input_png_path> <output_jpg_path>
```

### JPG to PNG

Convert a JPG image to a PNG image.

```bash
convertions jpgtopng <input_jpg_path> <output_png_path>
```

### PDF to JPG

Convert a PDF file to JPG images (one per page).

```bash
convertions pdftojpg <input_pdf_path> <output_jpg_path>
```

### JPGs to PDF

Combine multiple JPG images into a single PDF file.

```bash
convertions jpgstopdf <output_pdf_path> <input_jpg_path1> <input_jpg_path2> ...
```

### Image to Markdown

Extract text content from an image (JPG/PNG) using OCR and convert it to a Markdown file.

```bash
convertions imagetomd <input_image_path> <output_md_path>
```

### PDF to Markdown

Extract text content from a PDF file and convert it to a Markdown file.

```bash
convertions pdftomd <input_pdf_path> <output_md_path>
```

---

## Adding New Scripts

To add a new script to the convertions toolset:

To extend `convertions` with additional commands:

1. Place the new script in the `~/codeWS/Python3/convertions` directory.
2. Ensure the script is executable:
1. **Add the Script**: Place the script in the `utils` directory.
2. **Make It Executable**: Ensure the script has execute permissions:
   ```bash
   chmod +x ~/codeWS/Python3/convertions/new_script.py
   chmod +x <script_name>.py
   ```
3. Update `convertions.py` to include the new command and map it to the script.
3. **Update the Mapping**: Edit `convertions.py` and add a new entry to the `SCRIPT_MAP` dictionary:
   ```python
   "newcommand": "new_script.py"
   ```
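
The mapping step can be pictured with a small dispatcher. The sketch below is an assumption about how a `SCRIPT_MAP` lookup might be turned into a script invocation; only the `SCRIPT_MAP` name, the `utils` directory, and the `"newcommand": "new_script.py"` entry come from this README, while `build_command`, `run`, and the `csv_to_json.py` file name are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Illustrative excerpt; the real SCRIPT_MAP in convertions.py may look different.
SCRIPT_MAP = {
    "csvtojson": "csv_to_json.py",  # assumed file name, for illustration only
    "newcommand": "new_script.py",
}


def build_command(command: str, args: list[str], utils_dir: Path = Path("utils")) -> list[str]:
    """Translate a convertions command into the argv that runs its script."""
    try:
        script = SCRIPT_MAP[command]
    except KeyError:
        # Unknown commands stop the program with a readable message.
        raise SystemExit(f"Unknown command: {command}") from None
    # Run the script with the same interpreter that is running the dispatcher.
    return [sys.executable, str(utils_dir / script), *args]


def run(command: str, args: list[str]) -> int:
    """Execute the mapped script and return its exit code."""
    return subprocess.run(build_command(command, args)).returncode
```

With this shape, adding a command only requires a new dictionary entry; the dispatch logic itself stays untouched.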

---

## Virtual Environment

The convertions toolset uses a virtual environment to manage dependencies. Ensure the virtual environment is activated before running any scripts:

To ensure proper dependency management, activate the virtual environment before running or modifying the `convertions` tool:

```bash
source venv/bin/activate
```

To deactivate the virtual environment, use:

To deactivate the virtual environment:

```bash
deactivate
```
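
If it is unclear whether the environment is active, a script can check for itself. The snippet below is a small sketch, not part of the project; `in_virtualenv` is a hypothetical helper name. It relies on the fact that inside a `venv` environment `sys.prefix` points at the environment while `sys.base_prefix` points at the base Python installation.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Outside a virtual environment the two prefixes are identical.
    return sys.prefix != sys.base_prefix
```

A tool could call `in_virtualenv()` at startup and print a reminder to run `source venv/bin/activate` when it returns `False`.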

---

## Compilation into a Binary

The `install.sh` script simplifies the process of creating a standalone binary using PyInstaller. It also allows for optional installation to `/usr/local/bin` for system-wide usage.

### Steps:

1. Run the `install.sh` script:
   ```bash
   ./install.sh
   ```
2. Follow the prompts to decide whether to move the compiled binary to `/usr/local/bin`. If skipped, the binary will remain in the `./dist` directory.

### Key Features of `install.sh`:

- Checks and installs dependencies in the virtual environment.
- Compiles all scripts and utilities into a single binary.
- Optionally installs the binary globally for ease of use.

---

## Dependencies

This project relies on several Python libraries and external tools. Ensure the following are installed:

- Python 3.11+
- `wkhtmltopdf` (for HTML to PDF conversion)
- External dependencies in `requirements.txt`, managed by the virtual environment:
  - `pandas`
  - `PyMuPDF`
  - `pytesseract`
  - `markdown`
  - `pdfkit`
  - `gTTS`
  - `pillow`
  - `moviepy`
  - `PyYAML`
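
A quick way to confirm that these pieces are present is to probe for them without importing anything heavy. The following is a hedged sketch rather than a project utility: `missing_modules` and `missing_tools` are hypothetical helpers, and the import names are listed because several differ from the package names used with pip (`PyMuPDF` is imported as `fitz`, `pillow` as `PIL`, `PyYAML` as `yaml`, `gTTS` as `gtts`).

```python
import importlib.util
import shutil

# Import names for the packages listed above (they can differ from the pip names).
REQUIRED_MODULES = [
    "pandas", "fitz", "pytesseract", "markdown",
    "pdfkit", "gtts", "PIL", "moviepy", "yaml",
]
# External command-line tools that must be on PATH.
REQUIRED_TOOLS = ["wkhtmltopdf"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names: list[str]) -> list[str]:
    """Return the executables that are not available on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

Running `missing_modules(REQUIRED_MODULES)` and `missing_tools(REQUIRED_TOOLS)` inside the activated virtual environment returns empty lists when everything is installed.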

---

## License

This project is licensed under the MIT License - see the LICENSE file for details.

This project is licensed under the MIT License. See the `LICENSE` file for details.

---

## Acknowledgments

- [Pillow](https://python-pillow.org/)

The following libraries and tools power `convertions`:

- [Pillow (PIL)](https://python-pillow.org/)
- [PyMuPDF](https://pymupdf.readthedocs.io/)
- [pytesseract](https://pypi.org/project/pytesseract/)
- [pdfkit](https://pypi.org/project/pdfkit/)
- [Markdown](https://pypi.org/project/Markdown/)
- [MoviePy](https://zulko.github.io/moviepy/)
- [gTTS](https://pypi.org/project/gTTS/)
- [PyYAML](https://pypi.org/project/PyYAML/)
- [wkhtmltopdf](https://wkhtmltopdf.org/)