2 posts tagged with "workflow automation"

How to Export a Workflow as an Image and Run It as a Service?

· 11 min read
AlicZhang
Engineer of OOMOL

Have you ever considered running your Workflow on a server accessible to everyone, sharing your work with others, or running your Workflow as a background service for easy invocation at any time?

Now you can achieve this with OOMOL Studio. In this article, I will detail how to export a Workflow as an image and run it in the background. I will implement a Workflow that translates a PDF and converts it to an EPUB e-book, allowing me to read literature from foreign authors on my mobile device without language barriers.

You can download the specific Workflow implementation from translate-pdf-to-epub. You can follow this article to run the Workflow as a service on any computer or cloud platform you want.

To follow this article, you need basic knowledge of container images.

Workflow Development

If you already have a completed Workflow, you can skip to Export Image.

The functionality can be broken down into the following steps:

  1. Convert a PDF file to an EPUB file.
  2. Translate the converted EPUB file.
  3. Save the translated EPUB file to a specified location.

The process is simple, and we will use existing Blocks from the OOMOL community.

First, we create a project called translate-pdf-to-epub, then install the following Packages from the store:

pdf-craft

pdf-craft can convert PDF files to EPUB files.

books-translator

books-translator can translate EPUB files.

Then, we use the Analys PDF and Genrate EPUB Blocks from the pdf-craft Package, and the Translate epub book Block from the books-translator Package, chaining them together to form the Workflow.

Required Blocks

Implemented Workflow

The Workflow is now complete. Next, we need to fill in the parameters. The following parameters are required:

  1. The path of the PDF file to process
  2. The output path for the final EPUB file

The Translate epub book Block's default output language is Chinese. Adjust the parameters as needed.

You can clear the output_dir parameter to keep intermediate files in memory. They are released when the application closes.

Parameters to Fill In

Since the PDF input expects a file path, the PDF to be processed must be placed inside the OOMOL Studio environment so the application can read it. Copy the file into the OOMOL space from the left panel:

Place the file inside the OOMOL Studio application container

Then, select the path of the PDF file, and set the path and extension of the output EPUB file:

Completed Parameters

Next, run the Workflow. After some time, you will see the output file, and the content will be translated.

Workflow Output Result

You can see the article has been translated into Chinese.

At this point, we have developed and run a Workflow. Currently, the Workflow can only run inside the OOMOL Studio application. Next, we will detach the Workflow from the OOMOL Studio environment and run it independently.

Export Image

Return to the OOMOL Studio Home page, find the Workflow you just developed, right-click it, and select Export as Image. Choose the export folder. After a while, you will get an image file with the same name as the Workflow.

Exported Image File

Running the Image

After obtaining the image file, load it with the docker command. In the terminal, run:

docker load -i ~/Downloads/translate-pdf-to-epub.tar.zst  # Use the path where you saved the image file

After running, you can find the image:

Image File

Since the image runs outside OOMOL Studio, OOMOL cannot identify the user running the container. The current Workflow uses AI for translation, and OOMOL converts the translation cost into credits. Passing an API-key lets the credits be deducted from the corresponding user's account, so the Workflow can run normally.

If the Workflow does not use built-in services like AI, or uses your own AI service, you still need to pass in the API-key when starting the container, but no credits will be deducted.

You can generate an API-key in the OOMOL Console.

Generate API-key

Once the API-key is hidden, it cannot be retrieved again. Please keep it safe.

After obtaining the API-key, run the following command to start the container:

docker run --privileged -p 3000:3000 -e OOMOL_API_TOKEN={OOMOL_API_TOKEN} -v "$HOME/oomol-storage":/oomol-driver/oomol-storage localhost/translate-pdf-to-epub:latest  # Replace {OOMOL_API_TOKEN} with your API-key

After running the image, you can access the started container at http://localhost:3000. At this point, the previous Workflow is running as a local service.
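If you start the container from a script, a minimal Python sketch like the one below can assemble the same docker run arguments. The helper name build_run_command is hypothetical, and reading the API-key from an OOMOL_API_TOKEN environment variable on the host is an assumption made here to keep the key out of the script. The flags, port, mount target, and image name are the ones used in the command above.

```python
import os
import shlex


def build_run_command(api_token, host_storage,
                      image="localhost/translate-pdf-to-epub:latest",
                      port=3000):
    """Assemble the `docker run` arguments from this article as a list.

    `build_run_command` is a hypothetical helper, not part of OOMOL.
    """
    return [
        "docker", "run", "--privileged",
        "-p", f"{port}:3000",                                 # host port -> container port 3000
        "-e", f"OOMOL_API_TOKEN={api_token}",                 # API-key from the OOMOL Console
        "-v", f"{host_storage}:/oomol-driver/oomol-storage",  # shared storage folder
        image,
    ]


if __name__ == "__main__":
    # Assumption: the API-key was exported as OOMOL_API_TOKEN on the host.
    token = os.environ.get("OOMOL_API_TOKEN", "<your-api-key>")
    storage = os.path.join(os.path.expanduser("~"), "oomol-storage")
    # Print a copy-pasteable command line; use subprocess.run(...) to execute it.
    print(shlex.join(build_run_command(token, storage)))
```

Returning a list instead of a single string means the arguments can be handed to subprocess.run directly, without worrying about quoting paths that contain spaces.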

File Paths Inside the Image

In this example, the Workflow takes a file path as its input. However, once the Workflow is exported as an image, every path it receives is resolved inside the container. If you pass a host path directly to the Workflow, the Workflow inside the container cannot access the file.

Here, we mount the host's $HOME/oomol-storage path to /oomol-driver/oomol-storage inside the container. This way, adding files to $HOME/oomol-storage on your computer is equivalent to placing files in /oomol-driver/oomol-storage inside the container. This allows file exchange between the host and the container.
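The mapping between the two folders can be sketched as a small helper that turns a host path into the path the Workflow sees. The function name to_container_path is hypothetical. The container folder is the mount target from the docker run command in this article, and the host folder is whatever you mounted there.

```python
from pathlib import Path, PurePosixPath

# Mount target used by the docker run command in this article.
CONTAINER_ROOT = PurePosixPath("/oomol-driver/oomol-storage")


def to_container_path(host_path, host_root):
    """Translate a path under `host_root` into the matching container path.

    Hypothetical helper. Raises ValueError if `host_path` is not inside
    `host_root`, because such a file is not visible inside the container.
    """
    relative = Path(host_path).relative_to(host_root)
    return str(CONTAINER_ROOT.joinpath(*relative.parts))
```

For example, with $HOME/oomol-storage mounted and a home directory of /home/alice, the host file /home/alice/oomol-storage/test.pdf maps to /oomol-driver/oomol-storage/test.pdf, which is the value to pass to the Workflow.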

If you want to deploy the image to a cloud server, the same requirement applies: the Workflow inside the container must be able to access the actual file path. If you find mounting files cumbersome, you can change the input from a file path to a network address. For example, upload the file to AWS S3 or another network file service, then download and read it inside your Workflow. This way, file access always happens inside the image container, which avoids file-not-found issues.
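As a rough sketch of the download step, the function below fetches a file from a URL into a temporary file inside the container and returns its local path. It assumes the file can be fetched with a plain URL and no extra authentication (for example, a pre-signed link). The name fetch_to_local is hypothetical, and how you wire this into a Block depends on your own Workflow.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def fetch_to_local(url, suffix=".pdf"):
    """Download `url` to a temporary file and return the local path.

    Hypothetical helper; assumes the URL needs no extra authentication.
    """
    with urllib.request.urlopen(url) as response:
        # delete=False keeps the file on disk after the handle is closed,
        # so later steps of the Workflow can open it by path.
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as target:
            shutil.copyfileobj(response, target)
            return Path(target.name)
```

The returned path always points inside the container, so the rest of the Workflow can treat it like any other local file. Remove the temporary file once the Workflow no longer needs it.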

Calling the Workflow

After starting the image container, you can use HTTP requests to query, start, and stop the Workflow.

Visit http://localhost:3000/ui to see the HTTP request interfaces exposed by the image container:

HTTP API

For detailed interface descriptions, refer to the API documentation.

Query Workflows

Use the GET method to call http://localhost:3000/v1/flows to query the list of Workflows in the image container.

The returned content includes detailed Workflow structures, including Node information, input/output Handle names, and types.
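A minimal Python sketch of this call, using only the standard library, might look like the following. The endpoint and method come from the text above. The function names are hypothetical, and the sketch assumes the response body is JSON and that no extra request headers are needed when calling the local container.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"


def flows_request(base_url=BASE_URL):
    """Build the GET request for the list of Workflows (hypothetical helper)."""
    return urllib.request.Request(f"{base_url}/v1/flows", method="GET")


def list_flows(base_url=BASE_URL):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(flows_request(base_url)) as response:
        return json.load(response)
```

Calling list_flows() while the container is running returns the decoded response, where you can look up the Workflow name, Node IDs, and Handle names needed to start a task.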

Run Task

Use the POST method to call http://localhost:3000/v1/tasks to run a Workflow and create a task.

As mentioned in Workflow Development, before starting a Workflow, we need to fill in parameters for several Nodes. Here, we pass the parameters in the request:

Parameters for Starting a Task

The flowName / nodeId / handle information for the parameters can be found in Query Workflows.

Here, the input file path is /oomol-driver/oomol-storage/test.pdf. Since we previously mapped the /oomol-driver/oomol-storage folder to $HOME/oomol-storage during Running the Image, simply place test.pdf in $HOME/oomol-storage, and the Workflow can read the file. We name the output file local-service-output.
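The request itself can be sketched in Python as below. The endpoint and method come from the text above, but the body layout is an assumption: the key names inputValues and value, and the overall nesting, are hypothetical placeholders built around the flowName / nodeId / handle information mentioned earlier. Check the interface page at http://localhost:3000/ui for the real schema before using it.

```python
import json
import urllib.request


def create_task_request(payload, base_url="http://localhost:3000"):
    """Build the POST request that starts a task (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/tasks",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload shape; the real field names may differ.
example_payload = {
    "flowName": "<flowName from Query Workflows>",
    "inputValues": [
        {
            "nodeId": "<nodeId from Query Workflows>",
            "handle": "<handle from Query Workflows>",
            # Container path, not host path.
            "value": "/oomol-driver/oomol-storage/test.pdf",
        },
    ],
}
```

To send it, pass the request to urllib.request.urlopen and decode the response body the same way as for any other JSON reply.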

After the API call, you will receive a task ID.

Task ID

Query Tasks

Use the GET method to call http://localhost:3000/v1/tasks to list all tasks in the image container.

Use the GET method to call http://localhost:3000/v1/tasks/{task_id} to query task details by task ID.

You can poll these two APIs to check the task status.

When the task is complete, you will see the task status as completed and view the output result:

Completed Task
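The polling described above can be sketched as a small loop. It assumes the task details are a JSON object with a status field. Only the completed value is taken from this article, so other states (such as a failure state) are not handled and simply run into the timeout. The names wait_for_task and fetch_task are hypothetical: fetch_task is any function you supply that performs the GET request to /v1/tasks/{task_id} and returns the decoded body.

```python
import time


def wait_for_task(fetch_task, task_id, interval=5.0, timeout=3600.0,
                  sleep=time.sleep):
    """Poll `fetch_task(task_id)` until the task reports "completed".

    Hypothetical helper. Raises TimeoutError if the task has not
    completed within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        # Assumption: the task details carry a "status" field.
        if task.get("status") == "completed":
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not complete in {timeout}s")
        sleep(interval)
```

Passing the fetch function in as a parameter keeps the loop independent of the HTTP client, and the interval keeps the container from being queried in a tight loop.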

Summary

This article introduced how to export a Workflow from OOMOL Studio as an image and run it independently.

Users can deploy the image in any environment, turning it into a continuously running service. This allows the Workflow to maximize its value, whether on a local server, cloud platform, or in a team collaboration environment, enabling efficient reuse and sharing of automated processes.

Why is OOMOL Studio at the forefront of automated workflow development?

· 8 min read
AlicZhang
Engineer of OOMOL

Introduction

In today's rapidly evolving digital era, automated workflows have become a key tool for boosting productivity. From simple task connections to complex business process automation, a variety of tools have emerged to meet users' growing automation needs. However, as business complexity and customization requirements increase, traditional workflow tools are beginning to show their limitations.

Against this backdrop, OOMOL Studio was born. It not only inherits the advantages of existing tools but also achieves revolutionary breakthroughs in key technologies. By supporting a complete coding environment, local execution, image export, and a refined developer experience, OOMOL Studio is gradually approaching the ultimate form of automated workflow tools.

This article will delve into the development of automated workflows, the capability boundaries of current tools, and how OOMOL Studio is becoming the most powerful automated workflow solution through technological innovation.

A Flourishing Tool Ecosystem

There are many automated workflow tools on the market today, such as Zapier, n8n, and Dify. Each tool has a rich community and a large user base.

Zapier connects with over 7000 applications, n8n has more than 2000 templates, Dify has established itself in the AI field, and new workflow tools are constantly emerging. Overall, they all strive to reduce workload and improve efficiency for customers.

Automated workflows are like assembly lines in factories, turning every step in production into reusable units, linking them together, and running them automatically. This achieves clarity, control, and efficiency, ultimately improving productivity and saving time.

The Evolution of Automated Workflows

Zapier connects various applications on the market as units, including various AI Agents. For most users, the number of connections Zapier offers is sufficient to meet most simple needs, such as reading emails and replying via AI, or reading a spreadsheet and having AI analyze its performance. Many customers' needs stop here, so Zapier is enough.

This is a linear execution process, completing work step by step. In reality, workflows may not be so ideal; the steps are not always strictly sequential, and their relationships are more like a network, where some Nodes may be skipped as needed.

Linear workflow orchestration

However, business is constantly evolving, and real-world requirements always change. When your target customers change, or a provider in your workflow faces closure, you have to adjust your workflow: either find a replacement provider or implement the functionality yourself.

After automating simple needs, most customers will try to automate more complex and variable tasks. No one wants to be stuck with repetitive work. At this point, workflow platforms limited to fixed applications run into trouble. They can only let developers add new features and release them to users, which usually takes time. Therefore, Zapier provides features like Code and Path, hoping users can solve some special problems themselves.

That's why tools like n8n claim to be more flexible: they offer a better code editing experience, and their more complex workflow orchestration can handle more variable problems. In n8n, workflows can be orchestrated freely, with more logic control Nodes such as if and switch. These built-in tools make workflows more flexible and applicable to a wider range of scenarios.

Networked workflow orchestration
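To make the difference from a linear pipeline concrete, here is a toy Python sketch in which each Node returns its result together with the name of the Node to run next, so an if-style Node can route around steps that are not needed. This is a generic illustration with hypothetical names, not how n8n or OOMOL Studio execute workflows internally.

```python
def run_flow(nodes, start, value):
    """Run Nodes from `start` until one returns no successor.

    Each Node is a function returning (result, next_node_name_or_None).
    Returns the final value and the list of Nodes that actually ran.
    """
    visited = []
    current = start
    while current is not None:
        visited.append(current)
        value, current = nodes[current](value)
    return value, visited


# A tiny flow with one branching Node; all names are hypothetical.
nodes = {
    "read": lambda text: (text.strip(), "if_empty"),
    "if_empty": lambda text: (text, "skip" if not text else "shout"),
    "shout": lambda text: (text.upper(), None),
    "skip": lambda text: ("(nothing to do)", None),
}
```

Running run_flow(nodes, "read", "  hello ") visits read, if_empty, and shout, while an input of only spaces visits read, if_empty, and skip, so the shout Node is bypassed.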

This is the limit of what most workflow tools on the market can do today. They have implemented most programming concepts, which makes process control very flexible, and they support code so that users can meet small-scale customization needs. The code cannot rely on third-party Packages because it runs in the cloud, but most people accept this limitation.

OOMOL's Technological Breakthroughs and New Possibilities

At this point, most users can automate fairly complex business processes. But are there users with even more demanding needs? The answer is yes. Data analysts and scientists do much of their work in Python, relying on modules like pandas or numpy, so most workflow tools on the market cannot meet their needs.

Zapier community discussion about importing code libraries

n8n community discussion about importing code libraries

We can see that, to handle various business scenarios, workflow tools are becoming more and more like programming. After all, general-purpose programming languages are Turing complete, which means the closer you get to programming, the more you can do. That's why workflow tools now all introduce code modules.

So why don't they directly support full programming? The reason is that these tools run in the cloud, and supporting code compilation and dependency management in the cloud is very difficult. The more users there are, the more isolated environments are needed, which makes the cost prohibitive.

What about local application support? n8n is working on this, but supporting different languages on multiple platforms is still a huge challenge. Objectively, most needs are simple, and automating these needs and charging based on the time saved is already profitable. Investing further in programming support may not be cost-effective.

So why does OOMOL Studio still take on this difficult and uncertain task?

Current workflow tools are full of tasks like converting one file format to another, or generating a message based on search content and posting it to social media. You know these are not the complete business customers want. After converting a file, they need to share or analyze it; posting to social media is to attract users or gain data or revenue, but the remaining steps are too complex for current workflow tools to handle with simple if/else logic.

So it's not that users don't want to automate their business—it's that it's not possible yet.

Therefore, OOMOL Studio supports code so users can truly automate all their business. The Python and Node.js communities have a wealth of open-source solutions for various problems. We hope users can use these existing tools to solve problems, or build automated Flows themselves as needed. We believe that truly high-value work is relatively complex, and the profit customers can earn should depend on the difficulty of the problem and the time saved.

OOMOL Studio's convenient dependency installation

At the same time, local execution can utilize users' own computing resources. Not all workflows require powerful hardware; some businesses are complex but can be handled by ordinary computers. Local execution can save users unnecessary costs.

Difference from Pure Code Implementation

Since general-purpose code can express any computation, why not just write code instead of using OOMOL Studio?

In terms of flexibility, code is undoubtedly the highest. But the business goals users want to achieve are always time-sensitive. If you implement them by writing code, you need to install a development environment, choose a language and framework, build the application, and then publish and install it. The whole process is an engineering project. For individual users, this is a huge workload, and if requirements change, the process may have to be repeated.

OOMOL Studio has a built-in coding environment. After the workflow is completed, you only need to fill in basic information to publish and share it directly, greatly shortening the time from development to delivery. Users can also adjust any Node at any time, retaining the advantages of workflow tools. This helps users quickly iterate and deliver.

Conclusion

Let's return to the ultimate goal of automated workflow tools: to have computers automatically run most quantifiable work, evaluate the results, and output them to users, thereby saving a lot of time and improving human productivity.

We don't believe that articles on social media like "I made tens of thousands of dollars with this AI video workflow" are really trying to improve everyone's productivity. Their goal is just your consulting fee, or simply advertising. They only automate the simplest part of a real problem and share it repeatedly. In fact, they do improve productivity to some extent, but the truly difficult part is the customized process that requires manual user involvement.

OOMOL Studio chooses to explore this difficult path. Maybe we are not the ultimate solution, but at least we have made some efforts—giving users who want to truly solve complex problems and automate them the opportunity to achieve their goals.