diff --git a/README.md b/README.md
index 2d97aa1..24e9282 100644
--- a/README.md
+++ b/README.md
@@ -68,7 +68,7 @@ The project was created as part of a university software engineering course and
 
 This project uses several Node.js packages to handle video processing, AI-based document generation, backend communication, and the desktop user interface. Below is an overview of the most important dependencies and their purpose.
 
-Core Dependencies:
+##Core Dependencies:
 
 Electron: Used to build the desktop application. It allows the project to run as a cross-platform GUI application using web technologies.
 
@@ -117,28 +117,28 @@ Type definitions for fluent-ffmpeg to improve development experience and error c
 
 @types/cli-progress: Provides type support for progress bar functionality used during processing tasks.
 
-##Why These Packages Are Needed
+###Why These Packages Are Needed
 
 Together, these packages enable:
 
-Processing video and audio files
+*Processing video and audio files
 
-Communicating with AI models for document generation
+*Communicating with AI models for document generation
 
-Secure handling of API keys
+*Secure handling of API keys
 
-Generating structured documents (DOCX)
+*Generating structured documents (DOCX)
 
-Providing a user-friendly desktop interface
+*Providing a user-friendly desktop interface
 
-Ensuring code quality through testing
+*Ensuring code quality through testing
 
-##API Keys and Configuration
+###API Keys and Configuration
 
 The V2D – Video to Document tool uses external AI and media processing services to convert video and audio content into structured documents. To access these services, several API keys are required.
 For security reasons, API keys are not stored in the repository and must be provided via environment variables.
 
- #Supported API Keys
+ ##Supported API Keys
 
 The project currently supports the following API keys. Depending on the configuration and selected provider, one or more of these keys may be used.
 
@@ -170,7 +170,7 @@ Environment variable: SAYA_API_KEY
 
 Usage: Additional or experimental AI provider that can be integrated into the document generation pipeline.
 
-How to Set API Keys
+##How to Set API Keys
 
 API keys must be configured as environment variables before starting the application.
 
@@ -200,17 +200,19 @@ The .env file must not be committed to the repository and should be listed in .g
 
 Security Notes:
 
-API keys are injected at runtime
+*API keys are injected at runtime
 
-No secrets are stored in the source code
+*No secrets are stored in the source code
 
-Prevents accidental exposure of sensitive data
+*Prevents accidental exposure of sensitive data
 
-Supports secure collaboration in GitLab and CI/CD environments
+*Supports secure collaboration in GitLab and CI/CD environments
 
-Follows best practices for secret management
+*Follows best practices for secret management
 
-End-to-End User Guide (Video → Final Document)
+
+
+###End-to-End User Guide (Video → Final Document)
 
 This section describes how a user can create a structured document from a video using the V2D – Video to Document tool.
 
@@ -231,7 +233,7 @@ npm start
 
 The Electron-based GUI will open.
 
-Upload a Video File:
+#Upload a Video File:
 
 In the application interface, select Upload Video.
 
@@ -239,7 +241,9 @@ Choose a supported video file (e.g. .mp4, .mov).
 
 The video is loaded into the system for processing.
 
-Audio Extraction:
+
+
+#Audio Extraction:
 
 The application automatically extracts audio from the uploaded video.
 
@@ -247,7 +251,8 @@ This is handled internally using FFmpeg.
 
 No user interaction is required for this step.
 
-Speech-to-Text Transcription:
+
+#Speech-to-Text Transcription:
 
 The extracted audio is sent to the speech-to-text service.
 
@@ -255,7 +260,8 @@ The transcription process converts spoken content into text.
 
 The generated transcript is stored internally and used for further processing.
 
-Select Document Type:
+
+#Select Document Type:
 
 The user selects a document type (e.g. Meeting Report).
 
@@ -263,7 +269,8 @@ Each document type is based on a predefined prompt template.
 
 The selected template defines the structure and style of the final document.
 
-Document Generation:
+
+#Document Generation:
 
 The transcript and selected prompt are sent to the AI service.
 
@@ -271,7 +278,8 @@ The AI model processes the input and generates a structured document.
 
 The output is formatted in Markdown.
 
-Document Preview:
+
+#Document Preview:
 
 The generated document is displayed in the application preview.
 
@@ -279,11 +287,13 @@ Users can review the content before exporting.
 
 No manual editing is required, but validation is possible.
 
-Export the Final Document:
+
+#Export the Final Document:
 
 The user exports the document in the desired format.
 
-Supported formats include:
+
+#Supported formats include:
 
 Markdown (.md)
 
@@ -291,18 +301,19 @@ Word (.docx)
 
 The document is saved locally.
 
-Completion:
+
+#Completion:
 
 The final document is now ready for use.
 
 The user can repeat the process with another video if needed.
 
-Resources
+###Resources
 
 This section lists the main technologies, libraries, and external resources used in the V2D (Video to Document) project. These resources are required to understand, run, and further develop the application.
 
-Project Dependencies
+##Project Dependencies
 
 The following packages and tools are used in this project (as defined in package.json):
 
@@ -344,7 +355,8 @@ TypeScript – Improves code quality and type safety
 
 These dependencies enable the complete end-to-end workflow from video input to structured document output.
 
-Relevant Repositories
+
+##Relevant Repositories
 
 V2D Main Repository: https://gitlab.rlp.net/proj-wise2526-video2document/video2document
 