04/06/2024

Marek Gmyrek

Senior IT Architect

N8n: System Integration and Data Processing Tool


Intro

n8n is a no-code/low-code solution for process automation (workflow) and application integration.

Process vs. workflow

First, it is important to specify what the word “workflow” means in the context of n8n. By workflow we mean a sequence of related tasks that form a so-called directed graph. A directed graph is a collection of vertices (nodes) and edges connecting them, where each edge has a direction (we can picture the edges as arrows). A grid of one-way streets in a city, with intersections as nodes, is an example of a directed graph.

Each node has a task associated with it, such as data transformation or validation, reading from or writing to external data stores, calling a URL, or flow control (e.g., an "if" block). n8n also has a multitude of connectors to external systems such as databases, file systems, and message brokers.

Each n8n workflow has start nodes as well as end nodes. Workflows are initiated either manually, via external systems (webhook) or periodically using the built-in scheduler.
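The graph view described above can be sketched in a few lines of Python. This is only an illustration, not how n8n stores or executes workflows: the node names are hypothetical, and the ordering function assumes the graph has no cycles.

```python
from collections import deque

# Hypothetical workflow: node name -> list of downstream nodes (directed edges).
workflow = {
    "webhook": ["unzip"],
    "unzip": ["validate"],
    "validate": ["save", "report_error"],
    "save": [],
    "report_error": [],
}


def execution_order(graph):
    """Order nodes so that each one comes after all of its predecessors (Kahn's algorithm)."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1

    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order


# Start nodes have no incoming edges; end nodes have no outgoing edges.
start_nodes = [n for n in workflow if all(n not in t for t in workflow.values())]
end_nodes = [n for n, targets in workflow.items() if not targets]
```

Here "webhook" is the only start node, while "save" and "report_error" are end nodes; the "validate" node plays the role of an "if" block with two outgoing edges.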

In summary, n8n can process sets of related tasks, possibly very complex ones, from start node to end node, which together form a directed graph. It should be emphasised that executing workflows in n8n does not require any human interaction; hence, in our opinion, it is more accurate to talk about process execution than about workflows in the context of n8n. For the same reason, n8n should not be equated with BPM-class solutions.

Alternatives

Certainly, n8n is not the only solution for automating workflows. The two best known and most mature solutions competing with n8n, perhaps a little more oriented towards data processing (data pipelines), are Apache Airflow (https://airflow.apache.org/) and Prefect (https://www.prefect.io/).

License

n8n is made available both as a cloud-based solution (SaaS) and as a locally installed package. The cloud version is paid, while the on-prem version is available under two licenses: the Sustainable Use License and the Enterprise License. The first one, which is free of charge, allows you to use and modify the software exclusively for your own internal business purposes or for non-commercial or personal use. Distributing the software or making it available to others is permitted only free of charge and for non-commercial purposes. In other words, any use is allowed, unless the value of a product, service or module sold derives wholly or substantially from the functionality of n8n.


More about n8n

What makes n8n unique is its very simple and intuitive interface for creating workflows. n8n lets you build workflows visually by dragging and dropping nodes, and it offers extensive configuration options. As n8n is implemented on top of the JavaScript and Node.js stack, more demanding and technology-oriented users can integrate their own JavaScript code.

n8n is a mature solution. Currently, n8n offers almost 400 different nodes and this number is constantly growing. This powerful collection is divided into more than a dozen groups, the most important of which are:

  • core nodes, such as for running arbitrary code (execute command), reading and writing files, file transfer via FTP or SFTP, data compression, XML conversion, data flow control (if, merge, switch, wait, schedule, trigger workflow), HTTP requests, command execution via SSH,

  • analytics, where we can find connectors and integrations with e.g., Google Ads, Google Analytics, Grafana,

  • communication, with nodes for the AMQP and MQTT protocols or the AWS SES, AWS SNS, and AWS SQS services, as well as nodes for the APIs of services such as: ClickUp, Discord, Gmail, Mailchimp, Mailgun, Microsoft Outlook, Microsoft Teams, Reddit, RocketChat, Salesforce, Slack, Telegram, Twilio, Zammad, Zendesk, GitLab, GitHub, Kafka,

  • data stores, with nodes for AWS DynamoDB, AWS S3, Dropbox, Elasticsearch, FTP, Google BigQuery, Google Drive, Google Sheets, Microsoft OneDrive, MongoDB, MySQL, Nextcloud, Postgres, Redis, Snowflake, Strapi, TimescaleDB,

  • finance, with nodes for integration with PayPal, Stripe or Wise,

  • connectors for systems such as Jira, HubSpot, Microsoft Dynamics CRM, Pipedrive, Salesforce, Shopify, WooCommerce or Slack.

The number of integration nodes is impressive. In fact, it is currently hard to find a widely used system for which n8n does not have a suitable connector.

Similarly, a whole bunch of ready-to-use n8n workflows (currently more than 600) are available (https://n8n.io/workflows/). You can import them into your environment and start using them without any effort. Alternatively, you can take them as a starting point and customize them to your own needs, or use them as implementation examples, good practices or even as inspiration.


N8n in Hycom

In Hycom, we use n8n primarily as a tool for systems integration and for data processing and handling.
One typical use case for n8n is data import from third-party systems. Zipped files are copied to S3 storage, where an event is generated that initiates the workflow in n8n via a webhook. The workflow extracts the zip file and then processes each CSV file in turn. As these files can be large, they are processed in chunks. Each chunk is validated against an agreed format and then saved to the data stores by calling the relevant API.
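The chunk-and-validate step can be sketched as follows. This is a minimal Python illustration rather than our actual n8n workflow: the column names, the chunk size, and the `save` callback (which stands in for the API call to the data store) are all assumptions made for the example.

```python
import csv
import io
from itertools import islice

# Hypothetical "agreed format": every row must have these columns filled in.
REQUIRED_COLUMNS = ("id", "email")


def iter_chunks(rows, size):
    """Yield lists of at most `size` rows without loading the whole file."""
    rows = iter(rows)
    while chunk := list(islice(rows, size)):
        yield chunk


def is_valid(row):
    """A row is valid when every required column is present and non-empty."""
    return all((row.get(column) or "").strip() for column in REQUIRED_COLUMNS)


def process_csv(stream, save, chunk_size=2):
    """Validate rows chunk by chunk; pass valid rows to `save`, return the rejected ones."""
    rejected = []
    for chunk in iter_chunks(csv.DictReader(stream), chunk_size):
        valid = [row for row in chunk if is_valid(row)]
        rejected.extend(row for row in chunk if not is_valid(row))
        if valid:
            save(valid)  # stands in for the API call to the data store
    return rejected


sample = io.StringIO("id,email\n1,a@example.com\n2,\n3,c@example.com\n")
saved_batches = []
rejected = process_csv(sample, saved_batches.append)
```

Because the rows are read lazily, memory use depends on the chunk size rather than on the size of the file.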

Since we import different CSV formats, the main workflow triggers a corresponding subworkflow depending on the format of the imported file.
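One simple way to picture this dispatch is a lookup table from format to handler. The sketch below assumes that the format can be recognised from the CSV header row; the headers and handler functions are hypothetical and only stand in for n8n subworkflows.

```python
def import_customers(rows):
    """Hypothetical subworkflow for customer files."""
    return f"customers imported: {len(rows)}"


def import_products(rows):
    """Hypothetical subworkflow for product files."""
    return f"products imported: {len(rows)}"


# Assumption: each format is identified by its exact header row.
SUBWORKFLOWS = {
    ("id", "email"): import_customers,
    ("sku", "price"): import_products,
}


def dispatch(header, rows):
    """Pick the subworkflow that matches the header, or fail loudly."""
    handler = SUBWORKFLOWS.get(tuple(header))
    if handler is None:
        raise ValueError(f"unsupported CSV format: {list(header)}")
    return handler(rows)


result = dispatch(["sku", "price"], [{"sku": "A1", "price": "9.99"}])
```

Supporting a new format then means adding one entry to the table and one handler, while the main flow stays unchanged.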


A Real-World Performance Insight

Obviously, additional tasks such as error handling or authorisation (obtaining a JWT to call the REST API) are performed alongside this main workflow.
Other examples of n8n use cases include:

  • regular cleaning of data in data stores,

  • replication or updating of user data: The IAM component generates an event in the RabbitMQ queue, of which n8n is the consumer, and the workflow itself updates user data in several collaborating services,

  • automatic backup of workflows to GitHub.

Our use cases therefore cover the following areas:

  • n8n workflows as an integration bus between system components,

  • integrations with third-party systems,

  • a kind of extended cronjob, i.e., the execution of complex tasks at regular time intervals,

  • data processing, validation, and conversion.

A quick search on the Internet shows that the use of n8n is limited only by the ingenuity of its users. n8n is used for a variety of tasks, some as exotic as:

  • automated checks of documentation coverage,

  • generation of graphics for future software releases,

  • releasing software versions in the cloud,

  • detection and notification of overdue invoices,

  • scraping data from the Internet,

  • generation of prospects (leads) based on enquiries in Telegram messenger,

  • English language learning (generating a new word to be remembered as a tweet on a predefined Twitter account).

According to the documentation, n8n can be scaled relatively easily. In the so-called "queue" mode, n8n runs as several instances: one "main" instance controls the processing of workflows, which are physically executed on the other instances ("worker instances").
In Hycom, we did not implement such a configuration. However, during system performance tests we observed an almost exponential increase in the CPU and memory requirements of our n8n instance. Further analysis showed that the events generated in the system were arriving at the AMQP connector of the n8n instance, and each of them initiated the execution of a rather complex workflow.


Taming Event Overload

As the connector received events much faster than the rest of the workflow could process them, n8n kept starting new workflow executions without any limit, which saturated both CPU and memory. The solution turned out to be throttling incoming events on the AMQP connector, using a parameter available in its configuration ("Parallel Message Processing Limit", set to 4).
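The effect of such a limit can be illustrated with a small Python sketch. It is not n8n's implementation; it only shows the general idea of bounded concurrency: the consumer takes the next message only when one of the four slots is free. The `slow_workflow` handler and the message source are hypothetical.

```python
import asyncio


async def consume(messages, handler, limit=4):
    """Start a handler per message, but never run more than `limit` at once."""
    slots = asyncio.Semaphore(limit)
    tasks = []

    async def run(message):
        try:
            await handler(message)
        finally:
            slots.release()  # free the slot for the next message

    for message in messages:
        await slots.acquire()  # pause intake until a slot is free
        tasks.append(asyncio.create_task(run(message)))
    await asyncio.gather(*tasks)


# Track how many handlers run at the same time.
stats = {"active": 0, "peak": 0}


async def slow_workflow(message):
    """Hypothetical workflow that is slower than the rate of incoming messages."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)
    stats["active"] -= 1


asyncio.run(consume(range(20), slow_workflow, limit=4))
```

Without the semaphore, all 20 handlers would start at once; with it, the number of concurrent executions stays bounded no matter how fast messages arrive.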


Summary

Our experience with n8n so far has been very positive. In our opinion, n8n is a great tool for data integration and transformation. It can be run in virtually any environment, including Kubernetes. For Kubernetes, n8n is available as a Helm chart prepared to run in queue mode, i.e., in an architecture ready for scaling. We can also automatically save working copies of our workflows as JSON files on GitHub; fittingly, this is handled by a ready-to-use n8n workflow from the workflow template library.

All in all, it looks like n8n will stay with Hycom for a long time.
