As developers, we endure this torture every day. A simple regular expression could handle the data cleaning, but the tool only provides basic filtering. We want to insert an API call into the process, only to find that the tool allows nothing but preset integration services. When a special file format comes along, all we can do is watch the process get stuck.
This powerlessness of seeing the need but being unable to reach the solution is like being handed a Swiss Army knife with its most important blade welded shut.
In OOMOL Studio, we redefined how you interact with a workflow tool so that it truly fits the working habits of programmers. Imagine you are building a data processing flow and suddenly find that you need a special data conversion step. At that point you can insert a code editor window directly. The code block fits into the flow like a Lego brick and sits on an equal footing with the other visual nodes.
Writing code is as smooth as in a familiar IDE. You can write in Python, TypeScript, or JavaScript. OOMOL Studio manages dependencies with a containerized solution, so when you use a third-party library, its version and behavior stay consistent on whichever machine the workflow runs. You never have to handle virtual environments or package installations by hand, and common libraries are available out of the box.
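To make the regular-expression case concrete, here is a minimal sketch of the kind of cleaning step you might paste into a code block. It is plain Python and does not use any OOMOL-specific API; the function name `clean_record` and the two cleaning rules are assumptions chosen only for illustration.

```python
import re

# Illustrative patterns; the rules are assumptions, not part of OOMOL Studio.
_WHITESPACE = re.compile(r"\s+")
_NON_DIGIT = re.compile(r"\D")


def clean_record(record: dict) -> dict:
    """Collapse repeated whitespace in 'name' and keep only digits in 'phone'."""
    name = _WHITESPACE.sub(" ", record.get("name", "")).strip()
    phone = _NON_DIGIT.sub("", record.get("phone", ""))
    # Copy the record so the caller's dict is left untouched.
    return {**record, "name": name, "phone": phone}


if __name__ == "__main__":
    print(clean_record({"name": "  Ada   Lovelace ", "phone": "(555) 010-1234"}))
```

A step like this is exactly what a filter-only tool cannot express, yet it is only a few lines once a code block is available.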
In the video, I fully demonstrate how to quickly build a custom image processing pipeline with code slots.
Do you also collect a lot of scanned PDF documents? Academic papers, e-books, and work materials hold valuable content but are hard to read: the layout is rigid, the fonts cannot be adjusted, and on a phone you constantly have to zoom in and out.
Now, these PDFs can be easily converted into comfortable EPUB format through pdf-craft. Just like organizing a pile of paper documents into a portable e-book, you can finally browse these contents in the most suitable way for you on your favorite EPUB reader: adjust the font size, switch to night mode, or even listen to AI reading.
pdf-craft is an open source library dedicated to processing scanned book PDFs. It identifies text content, headers and footers, reference annotations, and similar elements in PDF files, keeps content that spans pages coherent, and restores the correct reading order. It also uses an LLM to build a complete table of contents for the EPUB.
It is very simple to use pdf-craft in oomol. First, create a blank project. Then type "pdf-craft" in the search box in the oomol store to find it.
Drag the “Analyse PDF” and “Generate EPUB” blocks onto the empty flow. Then, connect their output_dir and analysed_dir fields as shown in the figure.
Next, set the pdf field to the PDF source file to be processed, and set epub_file_path to the path of the converted EPUB file. Finally, click the Run button in the upper right corner to start the conversion.
Are you tired of the bad experience of messy formatting and no layout after e-book translation? EPUB Translator's intelligent translation technology not only perfectly preserves the exquisite layout of the original book, but also has a unique bilingual comparison function, making language learning more intuitive and efficient than ever before.
Whether reading original novels or studying professional literature, you can now get precisely aligned bilingual texts, with all illustrations, chapters and special formats fully preserved. The original sentence and the translation are presented side by side, helping you to easily compare and learn, just like having a translation mentor at your side at any time, making cross-language reading a truly pleasant learning journey.
epub-translator is an open source library that uses large AI models to translate EPUB e-books automatically while fully retaining the format, illustrations, table of contents, and layout of the original book. It is packaged as a shared block in oomol and can be found in the oomol store.
First, open the Community page from the left navigation bar and search for the "books translator" template. This preset template contains the necessary runtime environment and configuration. Click the "Use" button to initialize a new project. The system automatically creates a workspace containing all the dependencies of epub-translator, so there is no need to install any components manually.
After the project is created, you will see a preset workflow interface. Click the source_file field on the block to upload the e-book file to be translated. The standard epub format is supported here.
Next, set the key parameters. In the language field, use the drop-down menu to choose the target language, such as Simplified Chinese, English, or Japanese. Then configure which large language model to use for translation. We recommend DeepSeek Chat as the default translation engine, which handles both literary and technical text well. You can also use the prompt field to tell the large language model how to treat details such as character names and terminology.
The translation process is fully automated, and you can follow its progress in real time in the log window. When processing finishes, the system generates a new bilingual e-book. Open it and you will find the original formatting preserved: illustrations remain in their original positions, special fonts render correctly, and the table of contents is generated in a bilingual version. The system also processes the metadata inside the e-book so that the author, publisher, and other information display correctly in the new language.
Have you ever considered running your Workflow on a server accessible to everyone, sharing your work with others, or running your Workflow as a background service for easy invocation at any time?
Now you can achieve this with OOMOL Studio. In this article, I will detail how to export a Workflow as an image and run it in the background. I will implement a Workflow that translates a PDF and converts it to an EPUB e-book, allowing me to read literature from foreign authors on my mobile device without language barriers.
You can download the specific Workflow implementation from translate-pdf-to-epub. You can follow this article to run the Workflow as a service on any computer or cloud platform you want.
To follow this article, you need basic knowledge of container images.
If you already have a completed Workflow, you can skip to Export Image.
The functionality can be broken down into the following steps:
Convert a PDF file to an EPUB file.
Translate the converted EPUB file.
Save the translated EPUB file to a specified location.
The process is simple, and we will use existing Blocks from the OOMOL community.
First, we create a project called translate-pdf-to-epub, then install the following Packages from the store:
pdf-craft
pdf-craft can convert PDF files to EPUB files
books-translator
books-translator can translate EPUB files
Then, we use the Analyse PDF and Generate EPUB Blocks from the pdf-craft Package, and the Translate epub book Block from the books-translator Package, chaining them together to form the Workflow.
Required Blocks
Implemented Workflow
The Workflow is now complete. Next, we need to fill in the parameters. The following parameters are required:
The path of the PDF file to process
The output path for the final EPUB file
The Translate epub book Block's default output language is Chinese. Adjust the parameters as needed.
You can clear the output_dir parameter to store intermediate products in memory. Closing the application will release them.
Parameters to Fill In
Since the PDF input expects a file path, we need to place the PDF file to be processed inside the OOMOL Studio environment so the application can read it. Copy the file into the OOMOL space from the left panel:
Place the file inside the OOMOL Studio application container
Then, select the path of the PDF file, and set the path and extension of the output EPUB file:
Completed Parameters
Next, run the Workflow. After some time, you will see the output file, and the content will be translated.
Workflow Output Result
You can see the article has been translated into Chinese.
At this point, we have developed and run a Workflow. Currently, the Workflow can only run inside the OOMOL Studio application. Next, we will detach the Workflow from the OOMOL Studio environment and run it independently.
Return to the OOMOL Studio Home page, find the Workflow you just developed, right-click it, and select Export as Image. Choose the export folder, and after a while you will get an image file with the same name as the Workflow.
After obtaining the image file, load it with the docker command. In the terminal, run:
docker load -i ~/Downloads/translate-pdf-to-epub.tar.zst # Use the path where you saved the image file
After running, you can find the image:
Image File
Since the image runs outside OOMOL Studio, we cannot identify the user running the container. The current Workflow uses AI for translation, and OOMOL will convert the translation cost into credits. Using an API-key allows the credits to be deducted from the corresponding user's account, enabling the Workflow to run normally.
If the Workflow does not use built-in services like AI, or uses your own AI service, you still need to pass in the API-key when starting the container, but no credits will be deducted.
Once the API-key is hidden, it cannot be retrieved again. Please keep it safe.
After obtaining the API-key, run the following command to start the container:
docker run --privileged -p 3000:3000 -e OOMOL_API_TOKEN={OOMOL_API_TOKEN} -v $HOME/oomol-storage:/oomol-driver/oomol-storage localhost/translate-pdf-to-epub:latest
After running the image, you can access the started container at http://localhost:3000. At this point, the previous Workflow is running as a local service.
In this example, the Workflow starts by inputting a file path. However, after the Workflow is exported as an image, all paths inside the image container refer to paths inside the container. If you directly pass a host path to the Workflow, the Workflow inside the container cannot access the file.
Here, we mount the host's $HOME/oomol-storage path to /oomol-driver/oomol-storage inside the container. This way, adding files to $HOME/oomol-storage on your computer is equivalent to placing files in /oomol-driver/oomol-storage inside the container. This allows file exchange between the host and the container.
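The mapping between the two directories can be written down as a small helper. This is a sketch that assumes the same mount as in the command above; the function name `to_container_path` is hypothetical and is not part of OOMOL Studio.

```python
from pathlib import Path, PurePosixPath

# Mount target used in the docker run command above.
CONTAINER_ROOT = PurePosixPath("/oomol-driver/oomol-storage")


def to_container_path(host_path, host_root):
    """Translate a host path under host_root into the path seen inside the container.

    Raises ValueError when host_path is outside the mounted directory, because
    such a file is not visible to the Workflow at all.
    """
    relative = Path(host_path).resolve().relative_to(Path(host_root).resolve())
    return CONTAINER_ROOT.joinpath(*relative.parts).as_posix()


if __name__ == "__main__":
    root = Path.home() / "oomol-storage"
    print(to_container_path(root / "test.pdf", root))
```

Failing early on a path outside the mount is deliberate: it surfaces the mistake on the host instead of as a file-not-found error inside the container.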
If you want to deploy the image to a cloud server, you also need to meet this requirement: the Workflow inside the container must be able to access the actual file path. If you find mounting files cumbersome, you can change the input file path to a network address, such as uploading to AWS S3 or other network file management services, then download and read the file inside your Workflow. This way, file access always happens inside the image container, avoiding file-not-found issues.
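The download-first approach could look roughly like the sketch below, which uses only the Python standard library. It assumes the file is reachable through a plain URL that needs no extra authentication; the function name `fetch_input`, the fallback file name, and the target directory are illustrative assumptions.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlparse


def fetch_input(url: str, dest_dir: str) -> str:
    """Download url into dest_dir and return the path of the local copy."""
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Derive the file name from the URL path; fall back to a generic name.
    filename = Path(unquote(urlparse(url).path)).name or "input.pdf"
    target = target_dir / filename
    # Stream the response to disk instead of loading it into memory at once.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return str(target)
```

The returned path lives inside the container, so it can be passed directly to the block that reads the PDF.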
Use the POST method to call http://localhost:3000/v1/tasks to run a Workflow and create a task.
As mentioned in Workflow Development, before starting a Workflow, we need to fill in parameters for several Nodes. Here, we pass the parameters in the request:
Parameters for Starting a Task
The flowName / nodeId / handle information for the parameters can be found in Query Workflows.
Here, the input file path is /oomol-driver/oomol-storage/test.pdf. Since we mapped the /oomol-driver/oomol-storage folder to $HOME/oomol-storage in Running the Image, simply place test.pdf in $HOME/oomol-storage and the Workflow can read the file. We name the output file local-service-output.
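As a rough sketch, the request could be assembled in Python as follows. Only the URL, the POST method, and the flowName / nodeId / handle terms come from the text. The JSON layout (an `inputValues` list with a `value` key), the node IDs, and the output path are assumptions made for illustration, so take the real values and schema from Query Workflows.

```python
import json
import urllib.request

TASKS_URL = "http://localhost:3000/v1/tasks"


def build_task_request(flow_name, inputs):
    """Build a POST request that starts a Workflow task.

    inputs is a list of (node_id, handle, value) tuples.
    NOTE: the payload layout below is an assumed schema, not a documented one.
    """
    payload = {
        "flowName": flow_name,
        "inputValues": [
            {"nodeId": node_id, "handle": handle, "value": value}
            for node_id, handle, value in inputs
        ],
    }
    return urllib.request.Request(
        TASKS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_task_request(
        "translate-pdf-to-epub",
        [
            # Hypothetical node IDs; the handles are the field names used earlier.
            ("analyse-pdf", "pdf", "/oomol-driver/oomol-storage/test.pdf"),
            ("generate-epub", "epub_file_path",
             "/oomol-driver/oomol-storage/local-service-output.epub"),
        ],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the built request to `urllib.request.urlopen` would send it to the running container; the sketch stops short of that so it can be inspected without a service listening on port 3000.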
This article introduced how to export a Workflow from OOMOL Studio as an image and run it independently.
Users can deploy the image in any environment, turning it into a continuously running service. This allows the Workflow to maximize its value, whether on a local server, cloud platform, or in a team collaboration environment, enabling efficient reuse and sharing of automated processes.
In today's rapidly evolving digital era, automated workflows have become a key tool for boosting productivity. From simple task connections to complex business process automation, a variety of tools have emerged to meet users' growing automation needs. However, as business complexity and customization requirements increase, traditional workflow tools are beginning to show their limitations.
Against this backdrop, OOMOL Studio was born. It not only inherits the advantages of existing tools but also achieves revolutionary breakthroughs in key technologies. By supporting a complete coding environment, local execution, image export, and a refined developer experience, OOMOL Studio is gradually approaching the ultimate form of automated workflow tools.
This article will delve into the development of automated workflows, the capability boundaries of current tools, and how OOMOL Studio is becoming the most powerful automated workflow solution through technological innovation.
There are many automated workflow tools on the market today, such as Zapier, n8n, Dify, etc. Each tool has a rich community and a large user base.
Zapier connects with over 7000 applications, n8n has more than 2000 templates, Dify has established itself in the AI field, and new workflow tools are constantly emerging. Overall, they all strive to reduce workload and improve efficiency for customers.
Automated workflows are like assembly lines in factories, turning every step in production into reusable units, linking them together, and running them automatically. This achieves clarity, control, and efficiency, ultimately improving productivity and saving time.
Zapier connects various applications on the market as units, including various AI Agents. For most users, the number of connections Zapier offers is sufficient to meet most simple needs, such as reading emails and replying via AI, or reading a spreadsheet and having AI analyze its performance. Many customers' needs stop here, so Zapier is enough.
This is a linear execution process, completing work step by step. In reality, workflows may not be so ideal; the steps are not always strictly sequential, and their relationships are more like a network, where some Nodes may be skipped as needed.
Linear workflow orchestration
However, business is constantly evolving, and real-world requirements always change. When your target customers change, or a provider in your workflow faces closure, you have to adjust your workflow: either find a replacement provider or implement the functionality yourself.
After automating simple needs, most customers will try to automate more complex and variable tasks. No one wants to be stuck with repetitive work. At this point, workflow platforms limited to fixed applications run into trouble. They can only let developers add new features and release them to users, which usually takes time. Therefore, Zapier provides features like Code and Path, hoping users can solve some special problems themselves.
That's why tools like n8n claim to be more flexible—they support a better code editing experience, and their workflow orchestration is more complex, able to handle more variable problems. In n8n, workflows can be orchestrated freely, with more logic control Nodes such as if and switch. These built-in tools make workflows more flexible and applicable to a wider range of scenarios.
Networked workflow orchestration
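The difference between linear and networked orchestration can be sketched in a few lines of Python. This is a toy model and not how Zapier, n8n, or OOMOL Studio execute flows: every node returns its output plus the name of the next node, so an "if" node decides at run time which branch runs and which nodes are skipped. All node names are made up for illustration.

```python
def read_email(text):
    # Ordinary node: always continues to the routing node.
    return text.strip(), "route"


def route(text):
    # "if" node: choose the next node based on the data itself.
    return text, ("reply_with_ai" if "?" in text else "archive")


def reply_with_ai(text):
    # Terminal node: returning None as the next node ends the flow.
    return f"Auto-reply to: {text}", None


def archive(text):
    return f"Archived: {text}", None


NODES = {
    "read_email": read_email,
    "route": route,
    "reply_with_ai": reply_with_ai,
    "archive": archive,
}


def run_flow(nodes, start, value):
    """Follow next-node links from start and return (final value, visited node names)."""
    visited = []
    current = start
    while current is not None:
        visited.append(current)
        value, current = nodes[current](value)
    return value, visited


if __name__ == "__main__":
    print(run_flow(NODES, "read_email", "  Can we meet on Friday?  "))
    print(run_flow(NODES, "read_email", "  FYI: report attached.  "))
```

In a purely linear tool the list of visited nodes would be the same on every run; here it depends on the input, which is the flexibility that logic control Nodes add.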
This is the limit of what most workflow tools on the market can do today. They have implemented most programming concepts, which makes process control very flexible, and they support code so users can meet some small-scale customization needs. Because that code runs in the cloud, it cannot rely on third-party Packages, but most people accept this limitation.
OOMOL's Technological Breakthroughs and New Possibilities
On this basis, most users can already automate fairly complex business. But are there users with more advanced needs? The answer is yes. Data analysts and scientists do much of their work in Python, relying on packages such as pandas or numpy, so most workflow tools on the market cannot meet their needs.
Zapier community discussion about importing code libraries
n8n community discussion about importing code libraries
We can see that, to handle various business scenarios, workflow tools are becoming more and more like programming. After all, general-purpose programming languages are Turing complete, which means the closer a tool gets to programming, the more it can do. That's why workflow tools now all introduce code modules.
So why don't they directly support full programming? The reason is that these tools run in the cloud, and supporting code compilation and dependency management in the cloud is very difficult. The more users there are, the more isolated environments are needed, which makes the cost prohibitive.
What about local application support? n8n is working on this, but supporting different languages on multiple platforms is still a huge challenge. Objectively, most needs are simple, and automating these needs and charging based on the time saved is already profitable. Investing further in programming support may not be cost-effective.
So why does OOMOL Studio still take on this difficult and uncertain task?
Current workflow tools are full of tasks like converting one file format to another, or generating a message based on search content and posting it to social media. These tasks are only part of what customers actually want. After converting a file, they need to share or analyze it; they post to social media in order to attract users or to gain data or revenue. The remaining steps are too complex for current workflow tools to handle with simple if/else logic.
So it's not that users don't want to automate their business—it's that it's not possible yet.
Therefore, OOMOL Studio supports code so users can truly automate all their business. The Python and Node.js communities have a wealth of open-source solutions for various problems. We hope users can use these existing tools to solve problems, or build automated Flows themselves as needed. We believe that truly high-value work is relatively complex, and the profit customers can earn should depend on the difficulty of the problem and the time saved.
OOMOL Studio's convenient dependency installation
At the same time, local execution can utilize users' own computing resources. Not all workflows require powerful hardware; some businesses are complex but can be handled by ordinary computers. Local execution can save users unnecessary costs.
Since code is Turing complete, why not just write code instead of using OOMOL Studio?
In terms of flexibility, code is undoubtedly the highest. But the business users want to achieve is always time-sensitive. If you implement it by writing code, you need to install a development environment, choose a language and framework, build the application, and then publish and install it—the whole process is an engineering project. For individual users, this is a huge workload, and if requirements change, the process may have to be repeated.
OOMOL Studio has a built-in coding environment. After the workflow is completed, you only need to fill in basic information to publish and share it directly, greatly shortening the time from development to delivery. Users can also adjust any Node at any time, retaining the advantages of workflow tools. This helps users quickly iterate and deliver.
Let's return to the ultimate goal of automated workflow tools: to have computers automatically run most quantifiable work, evaluate the results, and output them to users, thereby saving a lot of time and improving human productivity.
We don't believe that articles on social media like "I made tens of thousands of dollars with this AI video workflow" are really trying to improve everyone's productivity. Their goal is just your consulting fee, or simply advertising. They only automate the simplest part of a real problem and share it repeatedly. In fact, they do improve productivity to some extent, but the truly difficult part is the customized process that requires manual user involvement.
OOMOL Studio chooses to explore this difficult path. Maybe we are not the ultimate solution, but at least we have made some efforts—giving users who want to truly solve complex problems and automate them the opportunity to achieve their goals.