<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Mark Karamyar]]></title><description><![CDATA[Mark Karamyar]]></description><link>https://devmarkpro.com</link><generator>RSS for Node</generator><lastBuildDate>Tue, 09 Jun 2026 06:10:51 GMT</lastBuildDate><atom:link href="https://devmarkpro.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Data Integration between databases]]></title><description><![CDATA[In an event-driven architecture, one of the common scenarios, when we are facing a monolith application, is how to get a huge amount of data out of the monolith database. This task became even more complicated when we have to take the stability and u...]]></description><link>https://devmarkpro.com/data-integration-between-databases-kafka-connect-debezium</link><guid isPermaLink="true">https://devmarkpro.com/data-integration-between-databases-kafka-connect-debezium</guid><category><![CDATA[software architecture]]></category><category><![CDATA[kafka]]></category><category><![CDATA[Kotlin]]></category><category><![CDATA[distributed system]]></category><category><![CDATA[Microservices]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Sat, 01 Jan 2022 17:51:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1641040716892/9luwr7hfg.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In an event-driven architecture, one of the common scenarios, when we are facing a monolith application, is how to get a huge amount of data out of the monolith database. This task became even more complicated when we have to take the stability and uptime of the database into account. 
We cannot shut down the entire database for a while just to take a snapshot of the data, and even if we could, how would we deal with the changes that keep arriving in the primary database?</p>
<p>So, let's break the problem into two pieces:</p>
<ol>
<li>Getting a large set of data (or even the whole) out of the database</li>
<li>Syncing changes</li>
</ol>
<p>To answer these questions, we first need to specify the target of the data: where do we want it to land in the end? The answer is not as simple as it seems. The options below are just a few of the possibilities:</p>
<ul>
<li>A destination database with the same database engine, for example from a PostgreSQL source to a PostgreSQL destination</li>
<li>A destination database with a different engine, or even a different DBMS, for example from PostgreSQL to MySQL</li>
<li>A destination data store with a different paradigm, for example from a relational database to Elasticsearch or vice versa</li>
<li>A destination cloud storage</li>
<li>An event broker </li>
<li>A simple JSON file!</li>
</ul>
<p>This list could contain dozens of possibilities. Although in a controlled environment we can eliminate many of them in advance, there are still plenty of items left on the list, which means our solution needs to support a good level of abstraction and flexibility.</p>
<p>So let's start to find/build the solution:</p>
<h2 id="heading-data-readers-and-adapters">Data readers and adapters</h2>
<p>Let's address the first challenge: getting the data. If we look at it from a high level and ignore the details, all we need is a set of <strong>readers</strong> and <strong>adapters</strong>:</p>
<ul>
<li>Readers: Reading data from the source </li>
<li>Adapters: Writing data to the destination </li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1641040348249/qifIlwqCJ.png" alt="reader-adapter-1.png" /></p>
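<p>The reader and adapter roles can be sketched in a few lines of Kotlin. This is only a minimal illustration, not a production design: <code>Row</code>, <code>Reader</code>, <code>Adapter</code>, <code>ListReader</code>, <code>ListAdapter</code>, and <code>copyAll</code> are hypothetical names, and in-memory lists stand in for the real source and destination.</p>
<pre><code class="lang-kotlin">// Hypothetical names throughout; none of these types come from a library.
data class Row(val id: Int, val name: String)

// A reader pulls one batch of rows from the source, starting at a given offset.
interface Reader {
    fun readBatch(offset: Int, size: Int): List&lt;Row&gt;
}

// An adapter writes one batch of rows to the destination.
interface Adapter {
    fun write(rows: List&lt;Row&gt;)
}

// In-memory stand-ins so the sketch runs without a real database.
class ListReader(private val rows: List&lt;Row&gt;) : Reader {
    override fun readBatch(offset: Int, size: Int): List&lt;Row&gt; = rows.drop(offset).take(size)
}

class ListAdapter : Adapter {
    val written = mutableListOf&lt;Row&gt;()
    override fun write(rows: List&lt;Row&gt;) {
        written += rows
    }
}

// Copies everything in batches and returns the number of rows copied.
fun copyAll(reader: Reader, adapter: Adapter, batchSize: Int = 2): Int {
    var offset = 0
    while (true) {
        val batch = reader.readBatch(offset, batchSize)
        if (batch.isEmpty()) return offset
        adapter.write(batch)
        offset += batch.size
    }
}

fun main() {
    val source = (1..5).map { Row(it, "product-$it") }
    val adapter = ListAdapter()
    val copied = copyAll(ListReader(source), adapter)
    println("copied $copied rows") // prints: copied 5 rows
}
</code></pre>
<p>Keeping the two roles behind small interfaces is what would let us swap a PostgreSQL reader for a MySQL one, or a database adapter for a JSON-file adapter, without touching the copy loop.</p>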
<p>Usually, however, the situation is more complex because there are multiple sources and destinations:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1641040716892/9luwr7hfg.png" alt="reader-adapter-2.png" /></p>
<p>This solution would of course work, but there are a couple of caveats that we need to consider:</p>
<ol>
<li>It's not easy to scale either readers or adapters. For example, how can we deal with a database that has hundreds of tables with millions of records each? One possible solution is to run a reader and an adapter per table, but then we somehow need to resolve the foreign keys in the destination database.</li>
<li>It's hard to make the process resumable. We might need extra storage for readers and adapters to keep track of their progress.</li>
<li>Although there shouldn't be any complex logic in our readers and adapters, developing and maintaining these applications still takes a lot of effort.</li>
</ol>
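<p>To make the resumability caveat a bit more concrete, here is a rough sketch of the "extra storage" idea: the copy loop saves its offset after every batch and picks it up again after a crash. <code>CheckpointStore</code>, <code>InMemoryCheckpointStore</code>, and <code>copyResumable</code> are hypothetical names, the <code>failAfter</code> parameter exists only to simulate a crash, and a real checkpoint store would of course have to be durable.</p>
<pre><code class="lang-kotlin">// Hypothetical sketch; a real checkpoint store would persist to disk or a database.
interface CheckpointStore {
    fun load(): Int
    fun save(offset: Int)
}

class InMemoryCheckpointStore : CheckpointStore {
    private var offset = 0
    override fun load() = offset
    override fun save(offset: Int) {
        this.offset = offset
    }
}

// Copies rows in batches and saves the progress after each batch.
// failAfter simulates a crash after that many batches (null means never crash).
fun copyResumable(
    source: List&lt;String&gt;,
    destination: MutableList&lt;String&gt;,
    checkpoints: CheckpointStore,
    batchSize: Int = 2,
    failAfter: Int? = null,
) {
    var batches = 0
    var offset = checkpoints.load()
    while (offset &lt; source.size) {
        if (failAfter == batches) error("simulated crash")
        val batch = source.subList(offset, minOf(offset + batchSize, source.size))
        destination += batch
        offset += batch.size
        checkpoints.save(offset)
        batches++
    }
}

fun main() {
    val source = listOf("a", "b", "c", "d", "e")
    val destination = mutableListOf&lt;String&gt;()
    val checkpoints = InMemoryCheckpointStore()

    // The first run crashes after one batch; the checkpoint remembers offset 2.
    runCatching { copyResumable(source, destination, checkpoints, failAfter = 1) }
    // The second run resumes from the checkpoint instead of starting over.
    copyResumable(source, destination, checkpoints)
    println(destination) // prints: [a, b, c, d, e]
}
</code></pre>
<p>Note that writing a batch and saving the checkpoint are two separate steps here. A crash between them would replay one batch on restart, so the destination writes would need to be idempotent.</p>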
<p>Apart from the issues mentioned above, the good news is that we have at least addressed the first challenge, haven't we?</p>
<h2 id="heading-using-triggers-to-sync-changes">Using triggers to sync changes</h2>
<p>Dumping data from source to destination is just the first part of our challenge. We also need to make sure that all the changes (create, update, delete) that happen in the source land in the destination data source. In other words, the source and destination must be eventually consistent. The simplest solution is using <strong>triggers</strong>. Here's the definition of a trigger from Wikipedia:</p>
<blockquote>
<p>A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. The trigger is mostly used for maintaining the integrity of the information on the database.</p>
</blockquote>
<p>So we can run a piece of code after specific types of events. Awesome, that's exactly what we wanted. All we need to do is define triggers in the source database to mirror the changes to the destination database. Here is a rough example of what it would look like:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TRIGGER</span> onProductInsert <span class="hljs-keyword">ON</span> product
<span class="hljs-keyword">FOR</span> <span class="hljs-keyword">INSERT</span>
<span class="hljs-keyword">AS</span>

<span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> product_replica
        (productName, quantity, color)
    <span class="hljs-keyword">SELECT</span>
        productName, quantity, color
        <span class="hljs-keyword">FROM</span> inserted
</code></pre>
<p>It sounds like a plan, but the functionality of triggers is limited. For example, think about these situations: producing a message to a Kafka topic, inserting a document into MongoDB, or updating the value of a specific cache item in Redis. Even if it's possible to do all of these in a trigger, it wouldn't be easy.</p>
<h2 id="heading-using-kafka-and-kafka-connect">Using Kafka and Kafka Connect</h2>
<p>We've found a solution so far, but there should be a more efficient way to solve this problem. Actually, there is! Kafka Connect is exactly what we need, and with it we can address both challenges effectively.</p>
<p>Here's the definition of Kafka Connect from <a target="_blank" href="https://docs.confluent.io/platform/current/connect/index.html#:~:text=Kafka%20Connect%20is%20a%20free,Kafka%20Connect%20for%20Confluent%20Platform.">Confluent</a>:</p>
<blockquote>
<p>Kafka Connect is a free, open-source component of Apache Kafka® that works as a centralized data hub for simple data integration between databases, key-value stores, search indexes, and file systems. The information provided here is specific to Kafka Connect for Confluent Platform.</p>
</blockquote>
<p>Kafka Connect contains two parts:</p>
<ol>
<li><strong>Connectors</strong> (source connectors, in Kafka Connect's own terminology) are responsible for reading data from the source and publishing it to a Kafka cluster.</li>
<li><strong>Sinks</strong> (sink connectors) connect to the Kafka cluster as consumers, get the data, and put it into the destination data storage.</li>
</ol>
<p>So Connectors are what we called <strong>Data Readers</strong> and Sinks are what we called <strong>Data Adapters</strong> so far. The good news is there are tons of Sinks and Connectors that are already available in the  <a target="_blank" href="https://www.confluent.io/hub/">Confluent Hub</a> that can make our life easier.</p>
<p>So our overall architecture now looks roughly like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1641048967252/qzzt21rgqf.png" alt="connect-1.png" /></p>
<p>OK, so let's get our hands dirty with Kafka Connect and see the magic.</p>
<p>The first thing we need is a Kafka cluster. Kafka uses  <a target="_blank" href="https://zookeeper.apache.org/">Zookeeper</a>  for configuration management and, more importantly, for consensus. The Kafka project is implementing its own Raft-based consensus protocol,  <a target="_blank" href="https://developer.confluent.io/learn/kraft/">KRaft</a>,  but at the time of writing it's not ready for production yet.</p>
<p>I'm using  <a target="_blank" href="https://docs.docker.com/compose/">docker compose</a>  to set up the entire stack, as it's easy to use and widely adopted among developers.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">version:</span> <span class="hljs-string">"3.9"</span>
<span class="hljs-attr">services:</span>
  <span class="hljs-attr">zookeeper:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">confluentinc/cp-zookeeper:6.2.1</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">zookeeper</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"2181:2181"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">ZOOKEEPER_CLIENT_PORT:</span> <span class="hljs-number">2181</span>
      <span class="hljs-attr">ZOOKEEPER_TICK_TIME:</span> <span class="hljs-number">2000</span>
  <span class="hljs-attr">broker:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">confluentinc/cp-kafka:6.2.1</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">broker</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">zookeeper</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"29092:29092"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">KAFKA_BROKER_ID:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_ZOOKEEPER_CONNECT:</span> <span class="hljs-string">'zookeeper:2181'</span>
      <span class="hljs-attr">KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:</span> <span class="hljs-string">PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT</span>
      <span class="hljs-attr">KAFKA_ADVERTISED_LISTENERS:</span> <span class="hljs-string">PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092</span>
      <span class="hljs-attr">KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS:</span> <span class="hljs-number">0</span>
</code></pre>
<p>The configuration is simple: you just need to pass a couple of settings, alongside the Zookeeper address, to the Kafka broker as  <a target="_blank" href="https://en.wikipedia.org/wiki/Environment_variable">environment variables</a>.</p>
<p>So let's talk about Kafka Connect and how it works. "Kafka Connect" runs as a worker, so we can deploy it like any other separate service on a virtual machine, AWS, GCP, or Kubernetes. Technically, it's just a Java application. It also has an HTTP API server that runs on port <code>8083</code> by default. </p>
<p>"Kafka Connect" is scalable and you can (and should) run it in distributed mode across several nodes. This means each node in the cluster has to know the state of the other nodes somehow. For example, when a node is added to the cluster out of the blue, it has to find answers to questions like: what is the current task? What is the status of the running task? Which records are the other nodes working on right now? What should I pick up?</p>
<p>No surprise that "Kafka Connect" uses the "Kafka Cluster" itself to keep the state of the jobs.</p>
<p>So what should we expect to need when setting up "Kafka Connect"? Let's think about it for a second. Since it works as a standalone application that has to connect to and talk with our Kafka cluster, it definitely needs the cluster's connection information. As it uses Kafka for keeping state, it also needs a couple of internal topics for managing state between nodes.</p>
<p>"Kafka Connect" can also use and verify the schema of input and output data via Protobuf, Avro, and JSON. We have to define our schemas in the  <a target="_blank" href="https://docs.confluent.io/platform/current/schema-registry/index.html">Schema Registry</a>; after that, Kafka clients handle the schema of the data in the producing and consuming steps by just carrying the schema ID. For the sake of simplicity, we ignore schema management and validation in this tutorial.
With all that in place, the Schema Registry and "Kafka Connect" configuration looks like this:</p>
<pre><code class="lang-yaml">  <span class="hljs-attr">schema-registry:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">confluentinc/cp-schema-registry:6.2.1</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">schema-registry</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">broker</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8081:8081"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">SCHEMA_REGISTRY_HOST_NAME:</span> <span class="hljs-string">schema-registry</span>
      <span class="hljs-attr">SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS:</span> <span class="hljs-string">'broker:9092'</span>
      <span class="hljs-attr">SCHEMA_REGISTRY_LOG4J_ROOT_LOGLEVEL:</span> <span class="hljs-string">WARN</span>
  <span class="hljs-attr">connect:</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">connect</span>
    <span class="hljs-attr">build:</span>
      <span class="hljs-attr">context:</span> <span class="hljs-string">.</span>
      <span class="hljs-attr">dockerfile:</span> <span class="hljs-string">connect.Dockerfile</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">CONNECT_BOOTSTRAP_SERVERS:</span> <span class="hljs-string">'broker:9092'</span>
      <span class="hljs-attr">CONNECT_GROUP_ID:</span> <span class="hljs-string">'connect'</span>
      <span class="hljs-attr">CONNECT_CONFIG_STORAGE_TOPIC:</span> <span class="hljs-string">'data-transformer-config'</span>
      <span class="hljs-attr">CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_OFFSET_STORAGE_TOPIC:</span> <span class="hljs-string">'data-transformer-offset'</span>
      <span class="hljs-attr">CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_STATUS_STORAGE_TOPIC:</span> <span class="hljs-string">'data-transformer-status'</span>
      <span class="hljs-attr">CONNECT_STATUS_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_KEY_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>
      <span class="hljs-attr">CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE:</span> <span class="hljs-string">"false"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE:</span> <span class="hljs-string">"false"</span>
      <span class="hljs-attr">CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL:</span> <span class="hljs-string">"http://schema-registry:8081"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL:</span> <span class="hljs-string">"http://schema-registry:8081"</span>
      <span class="hljs-attr">CONNECT_REST_ADVERTISED_HOST_NAME:</span> <span class="hljs-string">"localhost"</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">broker</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8083:8083"</span>
</code></pre>
<p>So far we have just spun up a "Kafka Connect" instance. How can we tell this instance to ship data between source and destination? As we described earlier, Kafka Connect does this with connectors. Connectors are simply Java plugins, many of them developed by the community. One of the amazing connectors for getting data out of a database is  <a target="_blank" href="https://debezium.io/">Debezium</a>.</p>
<h3 id="heading-what-debezium-does-and-how-it-works">What Debezium does and how it works</h3>
<p>Do you still remember the original problem? We were looking to find a solution for these problems:</p>
<ol>
<li>Getting a large set of data (or even the whole) out of the database</li>
<li>Syncing changes</li>
</ol>
<p>Debezium has solutions for both of these problems, and it scales as well as Kafka Connect itself does.</p>
<p>For getting the data, it simply takes a snapshot of the current data in the database and ships it into Kafka. For the change data capture (CDC) part, it uses the write-ahead log (the binary log in MySQL) to get the changes from the leader node. Using the WAL is a great alternative to triggers: it's fast, efficient, and scalable, and it puts very little extra burden on the leader node. The changes arrive in our Kafka topic about as quickly as a follower node would receive them.</p>
<blockquote>
<p>If you don't know what a write-ahead log (WAL) is, you can think of it as a simple append-only file that records all the INSERT, UPDATE, and DELETE commands executed on the leader node. Whenever a DML command is sent to the database engine, the engine writes the command to the WAL file, so we can replay all the changes by just running the commands in the file in order. Note that this is a super-simplified explanation for better understanding; database engines use WAL files in more sophisticated ways.</p>
</blockquote>
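<p>The "snapshot first, then replay the log in order" idea can be illustrated with a toy model. This is a deliberately simplified sketch and not how Debezium or PostgreSQL represent changes internally: <code>Change</code> and <code>replay</code> are hypothetical names, and a plain map stands in for a table.</p>
<pre><code class="lang-kotlin">// Hypothetical, heavily simplified change log; real WAL records are binary and lower level.
sealed class Change {
    data class Insert(val id: Int, val name: String) : Change()
    data class Update(val id: Int, val name: String) : Change()
    data class Delete(val id: Int) : Change()
}

// Replays the log in order on top of an initial snapshot and returns the resulting table.
fun replay(snapshot: Map&lt;Int, String&gt;, log: List&lt;Change&gt;): Map&lt;Int, String&gt; {
    val table = snapshot.toMutableMap()
    for (change in log) {
        when (change) {
            is Change.Insert -&gt; table[change.id] = change.name
            is Change.Update -&gt; table[change.id] = change.name
            is Change.Delete -&gt; table.remove(change.id)
        }
    }
    return table
}

fun main() {
    val snapshot = mapOf(1 to "chair")
    val log = listOf(
        Change.Insert(2, "table"),
        Change.Update(1, "armchair"),
        Change.Delete(2),
    )
    println(replay(snapshot, log)) // prints: {1=armchair}
}
</code></pre>
<p>The order matters: replaying the same three changes in a different order could leave row 2 in the table, which is why a consumer of the log has to process the changes sequentially.</p>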
<p>To let Debezium do its job, we need to change our Postgres config a little and add a couple of permissions. Debezium has a special  <a target="_blank" href="https://hub.docker.com/r/debezium/postgres">Docker image</a> based on the  <a target="_blank" href="https://hub.docker.com/_/postgres">Official Postgres image</a>  with a couple of small changes.</p>
<pre><code class="lang-yaml">  <span class="hljs-attr">product_db:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">debezium/postgres:13</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">productdb</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">POSTGRES_PASSWORD:</span> <span class="hljs-string">secret</span>
      <span class="hljs-attr">POSTGRES_DB:</span> <span class="hljs-string">product_db</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"5400:5432"</span>
</code></pre>
<p>Our final <code>docker-compose.yml</code> file should look like this:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">version:</span> <span class="hljs-string">"3.9"</span>
<span class="hljs-attr">services:</span>
  <span class="hljs-attr">product_db:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">debezium/postgres:13</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">productdb</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">POSTGRES_PASSWORD:</span> <span class="hljs-string">secret</span>
      <span class="hljs-attr">POSTGRES_DB:</span> <span class="hljs-string">product_db</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"5400:5432"</span>

  <span class="hljs-attr">zookeeper:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">confluentinc/cp-zookeeper:6.2.1</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">zookeeper</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"2181:2181"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">ZOOKEEPER_CLIENT_PORT:</span> <span class="hljs-number">2181</span>
      <span class="hljs-attr">ZOOKEEPER_TICK_TIME:</span> <span class="hljs-number">2000</span>

  <span class="hljs-attr">broker:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">confluentinc/cp-kafka:6.2.1</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">broker</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">zookeeper</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"29092:29092"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">KAFKA_BROKER_ID:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_ZOOKEEPER_CONNECT:</span> <span class="hljs-string">'zookeeper:2181'</span>
      <span class="hljs-attr">KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:</span> <span class="hljs-string">PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT</span>
      <span class="hljs-attr">KAFKA_ADVERTISED_LISTENERS:</span> <span class="hljs-string">PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092</span>
      <span class="hljs-attr">KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS:</span> <span class="hljs-number">0</span>

  <span class="hljs-attr">schema-registry:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">confluentinc/cp-schema-registry:6.2.1</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">schema-registry</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">broker</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8081:8081"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">SCHEMA_REGISTRY_HOST_NAME:</span> <span class="hljs-string">schema-registry</span>
      <span class="hljs-attr">SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS:</span> <span class="hljs-string">'broker:9092'</span>
      <span class="hljs-attr">SCHEMA_REGISTRY_LOG4J_ROOT_LOGLEVEL:</span> <span class="hljs-string">WARN</span>

  <span class="hljs-attr">connect:</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">connect</span>
    <span class="hljs-attr">build:</span>
      <span class="hljs-attr">context:</span> <span class="hljs-string">.</span>
      <span class="hljs-attr">dockerfile:</span> <span class="hljs-string">connect.Dockerfile</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">CONNECT_BOOTSTRAP_SERVERS:</span> <span class="hljs-string">'broker:9092'</span>
      <span class="hljs-attr">CONNECT_GROUP_ID:</span> <span class="hljs-string">'connect'</span>
      <span class="hljs-attr">CONNECT_CONFIG_STORAGE_TOPIC:</span> <span class="hljs-string">'data-transformer-config'</span>
      <span class="hljs-attr">CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_OFFSET_STORAGE_TOPIC:</span> <span class="hljs-string">'data-transformer-offset'</span>
      <span class="hljs-attr">CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_STATUS_STORAGE_TOPIC:</span> <span class="hljs-string">'data-transformer-status'</span>
      <span class="hljs-attr">CONNECT_STATUS_STORAGE_REPLICATION_FACTOR:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">CONNECT_KEY_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>
      <span class="hljs-attr">CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE:</span> <span class="hljs-string">"false"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER:</span> <span class="hljs-string">"org.apache.kafka.connect.json.JsonConverter"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE:</span> <span class="hljs-string">"false"</span>
      <span class="hljs-attr">CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL:</span> <span class="hljs-string">"http://schema-registry:8081"</span>
      <span class="hljs-attr">CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL:</span> <span class="hljs-string">"http://schema-registry:8081"</span>
      <span class="hljs-attr">CONNECT_REST_ADVERTISED_HOST_NAME:</span> <span class="hljs-string">"localhost"</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">broker</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8083:8083"</span>
</code></pre>
<p>We can spin up all the services by running <code>docker-compose up -d</code>. You can check your setup by running <code>docker-compose ps</code>. What you see should look like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1641055401645/5nLWjvr8e.png" alt="Screenshot 2022-01-01 at 17.43.05.png" /></p>
<p>OK, if you are still reading this tutorial, you are really serious about learning more about Kafka Connect. What we have done so far is just start a couple of services and wire them together. I mean, we really did nothing so far! Fortunately, the rest is going to be easier because we delegate most of our work to well-tested Kafka connectors.</p>
<p>Now it's time to tell the connector to do its job. If you look at the configuration, we still haven't told the connectors which database they should connect to! We have to define our connectors in "Kafka Connect" via the HTTP API it provides. Each connector has its own set of configuration options, so for more details you need to refer to that connector's documentation. In this tutorial, we are using the <strong>Debezium</strong> connector.</p>
<p>For defining configuration we can send an HTTP PUT request to Kafka Connect API <code>/connectors/{meaningful_connector_name}/config</code></p>
<pre><code class="lang-bash">curl --location --request PUT <span class="hljs-string">'http://localhost:8083/connectors/connector-debezium-product-001/config'</span> \
--header <span class="hljs-string">'Content-Type: application/json'</span> \
--data-raw <span class="hljs-string">'{
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "product_db",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "secret",
    "database.dbname" : "product_db",
    "database.server.name": "fulfillment"
  }'</span>
</code></pre>
<p>This could be done with a POST request as well, but the PUT request acts like <code>InsertOrUpdate</code>, and since this operation is  <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Glossary/Idempotent">idempotent</a>, I prefer to use the PUT method.</p>
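<p>If you would rather register the connector from code than from curl, the same PUT request could be sent with the HTTP client that ships with the JDK (Java 11 or newer). This is only a sketch: <code>connectorConfigRequest</code> is a hypothetical helper, and the URL, connector name, and JSON body are simply copied from the curl command above.</p>
<pre><code class="lang-kotlin">import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Builds the same PUT request as the curl command.
// The default base URL assumes the 8083 port mapping from docker-compose.yml.
fun connectorConfigRequest(
    name: String,
    configJson: String,
    baseUrl: String = "http://localhost:8083",
): HttpRequest =
    HttpRequest.newBuilder(URI.create("$baseUrl/connectors/$name/config"))
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(configJson))
        .build()

fun main() {
    val config = """
        {
          "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
          "database.hostname": "product_db",
          "database.port": "5432",
          "database.user": "postgres",
          "database.password": "secret",
          "database.dbname": "product_db",
          "database.server.name": "fulfillment"
        }
    """.trimIndent()
    val request = connectorConfigRequest("connector-debezium-product-001", config)
    println("${request.method()} ${request.uri()}")

    // Sending only works while the Kafka Connect container is running.
    runCatching { HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) }
        .onSuccess { println("HTTP ${it.statusCode()}") }
        .onFailure { println("Kafka Connect is not reachable: ${it.message}") }
}
</code></pre>
<p>Because the request is a PUT to a fixed connector name, running this program twice should leave Kafka Connect in the same state as running it once.</p>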
<p>Let's check that everything is wired up correctly so far by sending a GET request to <code>http://localhost:8083/connectors</code>:</p>
<pre><code class="lang-json">[
    <span class="hljs-string">"connector-debezium-product-001"</span>
]
</code></pre>
<p>You can get more details about each connector: </p>
<pre><code><span class="hljs-comment">// GET: http://localhost:8083/connectors/connector-debezium-product-001/status</span>
{
    <span class="hljs-attr">"name"</span>: <span class="hljs-string">"connector-debezium-product-001"</span>,
    <span class="hljs-attr">"connector"</span>: {
        <span class="hljs-attr">"state"</span>: <span class="hljs-string">"RUNNING"</span>,
        <span class="hljs-attr">"worker_id"</span>: <span class="hljs-string">"localhost:8083"</span>
    },
    <span class="hljs-attr">"tasks"</span>: [
        {
            <span class="hljs-attr">"id"</span>: <span class="hljs-number">0</span>,
            <span class="hljs-attr">"state"</span>: <span class="hljs-string">"RUNNING"</span>,
            <span class="hljs-attr">"worker_id"</span>: <span class="hljs-string">"localhost:8083"</span>
        }
    ],
    <span class="hljs-attr">"type"</span>: <span class="hljs-string">"source"</span>
}
</code></pre><p>If there were data in the database, the connector would have started shipping it as soon as it spun up. Our database is empty, so the connector is simply ready to capture changes in the source database.
Let's create an application that simply inserts data into our database every second. I'm using  <a target="_blank" href="https://kotlinlang.org/">Kotlin</a> for this application. Since this tutorial is not about learning Kotlin, I'll just quickly mention the libraries that I used.
It's a good idea to use meaningful data even in testing. It helps with debugging, especially when you are working on a distributed system and need to follow the data across different systems. I used <a target="_blank" href="https://github.com/serpro69/kotlin-faker">Faker</a> to generate relatively meaningful fake data. For connecting to and working with the database, I used JetBrains' newly developed ORM named  <a target="_blank" href="https://github.com/JetBrains/Exposed">Exposed</a>.</p>
<pre><code class="lang-kotlin"><span class="hljs-comment">// build.gradle.kts</span>
plugins {
    kotlin(<span class="hljs-string">"jvm"</span>) version <span class="hljs-string">"1.6.10"</span>
}

group = <span class="hljs-string">"com.devmarkpro.connector"</span>
version = <span class="hljs-string">"1.0-SNAPSHOT"</span>

repositories {
    mavenCentral()
}

dependencies {
    implementation(kotlin(<span class="hljs-string">"stdlib"</span>))
    implementation(<span class="hljs-string">"org.jetbrains.exposed"</span>, <span class="hljs-string">"exposed-core"</span>, <span class="hljs-string">"0.37.3"</span>)
    implementation(<span class="hljs-string">"org.jetbrains.exposed"</span>, <span class="hljs-string">"exposed-dao"</span>, <span class="hljs-string">"0.37.3"</span>)
    implementation(<span class="hljs-string">"org.jetbrains.exposed"</span>, <span class="hljs-string">"exposed-jdbc"</span>, <span class="hljs-string">"0.37.3"</span>)
    implementation(<span class="hljs-string">"org.postgresql:postgresql:42.2.2"</span>)
    implementation(<span class="hljs-string">"io.github.serpro69:kotlin-faker:1.9.0"</span>)
}
</code></pre>
<p><a target="_blank" href="https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.concurrent/fixed-rate-timer.html">fixedRateTimer</a> is used to run a function repeatedly at a fixed interval. I set it to fire every 1000 milliseconds (every second).</p>
<pre><code class="lang-kotlin"><span class="hljs-keyword">import</span> io.github.serpro69.kfaker.faker
<span class="hljs-keyword">import</span> org.jetbrains.exposed.dao.id.IntIdTable
<span class="hljs-keyword">import</span> org.jetbrains.exposed.sql.*
<span class="hljs-keyword">import</span> org.jetbrains.exposed.sql.transactions.transaction
<span class="hljs-keyword">import</span> kotlin.concurrent.fixedRateTimer
<span class="hljs-keyword">import</span> kotlin.random.Random

<span class="hljs-keyword">object</span> Product : IntIdTable() {
    <span class="hljs-keyword">val</span> name = varchar(<span class="hljs-string">"name"</span>, <span class="hljs-number">500</span>)
    <span class="hljs-keyword">val</span> price = double(<span class="hljs-string">"price"</span>).default(<span class="hljs-number">0.0</span>)
    <span class="hljs-keyword">val</span> color = varchar(<span class="hljs-string">"color"</span>, <span class="hljs-number">50</span>).nullable()
    <span class="hljs-keyword">val</span> category = varchar(<span class="hljs-string">"category"</span>, <span class="hljs-number">50</span>)
    <span class="hljs-keyword">val</span> quantity = integer(<span class="hljs-string">"quantity"</span>).default(<span class="hljs-number">0</span>)
    <span class="hljs-keyword">val</span> description = varchar(<span class="hljs-string">"description"</span>, <span class="hljs-number">1000</span>).nullable()
    <span class="hljs-keyword">val</span> internalId = varchar(<span class="hljs-string">"internalId"</span>, <span class="hljs-number">50</span>)
}

<span class="hljs-function"><span class="hljs-keyword">fun</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">val</span> db = Database.connect(
        <span class="hljs-string">"jdbc:postgresql://127.0.0.1:5400/product_db"</span>,
        driver = <span class="hljs-string">"org.postgresql.Driver"</span>,
        user = <span class="hljs-string">"postgres"</span>,
        password = <span class="hljs-string">"secret"</span>,
    )
    <span class="hljs-keyword">val</span> faker = faker{}
    <span class="hljs-function"><span class="hljs-keyword">fun</span> <span class="hljs-title">toss</span><span class="hljs-params">()</span></span>: <span class="hljs-built_in">Boolean</span> = Random.nextDouble() &lt; <span class="hljs-number">0.5</span>
    fixedRateTimer(<span class="hljs-string">"timer"</span>, <span class="hljs-literal">false</span>, <span class="hljs-number">0L</span>, <span class="hljs-number">1000</span>) {
        transaction(db) {
            addLogger(StdOutSqlLogger)
            SchemaUtils.create(Product)
            <span class="hljs-keyword">val</span> productName = faker.commerce.productName()
            <span class="hljs-keyword">val</span> productId = Product.insert {
                it[name] = productName
                it[price] = Random.nextDouble()
                it[color] = faker.color.name()
                it[category] = faker.commerce.department()
                it[quantity] = Random.nextInt(<span class="hljs-number">0</span>, <span class="hljs-number">100</span>)
                it[description] = <span class="hljs-keyword">if</span> (toss()) faker.lorem.punctuation() <span class="hljs-keyword">else</span> <span class="hljs-literal">null</span>
                it[internalId] = faker.code.asin()
            } <span class="hljs-keyword">get</span> Product.id
            println(<span class="hljs-string">"product <span class="hljs-variable">$productId</span> : <span class="hljs-variable">$productName</span> inserted"</span>)
        }
    }
}
</code></pre>
<p>Once the application is running, Kafka Connect starts pushing changes to the Kafka cluster.
Let's look at the data that we have in Kafka. First, let's take a look at the Kafka topics in our cluster by running <code>docker-compose exec broker kafka-topics --describe --zookeeper zookeeper:2181</code>. This command will probably show a lot of topics, but most of them are used internally by the Kafka cluster and Kafka Connect. The topic that we are looking for is related to the <code>Product</code> table, so let's filter the output:</p>
<p><code>docker-compose exec broker kafka-topics --describe --zookeeper zookeeper:2181 | grep product</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1641058004473/BFlhdMFZk.png" alt="Screenshot 2022-01-01 at 18.26.34.png" /></p>
<p>Your output might be a little different from mine, but there should be a topic whose name contains <code>product</code>. Now let's consume the topic and see what data is flowing through it.
There are a couple of ways to consume a Kafka topic: you can use one of the Kafka GUIs out there, or you can write a simple application that consumes the topic and prints the received data to a file or to the console. In this tutorial, we use the <code>kafka-console-consumer</code> command-line tool by executing <code>docker-compose exec broker kafka-console-consumer --bootstrap-server broker:9092 --topic fulfillment.public.product</code>. This is a long-running command that keeps consuming the topic until you send it a <code>SIGINT</code> signal (Ctrl+C). Either way, you should see data coming into the Kafka topic.
If you look at the data printed to the console, you will see a couple of cool things inside it. Debezium is not just sending the data, it sends the schema of the data as well. That's because <code>ALTER TABLE</code> commands do not produce a separate message, so it's always a good idea to match the data against the schema that arrives with it.</p>
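<p>In the <code>payload</code> field you can see the <code>before</code> and <code>after</code> properties. In our case <code>before</code> is null because we only insert data, but for an update command you get both versions of the row (how much of <code>before</code> is populated depends on the table's <code>REPLICA IDENTITY</code> setting).</p>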
<p>It is also possible to modify the data slightly or drop a specific column before shipping it to Kafka. Maybe in the future I will prepare a separate article for these cases as well.</p>
<p>Here's a complete example of a message:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"schema"</span>: {
        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"struct"</span>,
        <span class="hljs-attr">"fields"</span>: [
            {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"struct"</span>,
                <span class="hljs-attr">"fields"</span>: [
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int32"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"default"</span>: <span class="hljs-number">0</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"id"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"name"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"double"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"default"</span>: <span class="hljs-number">0</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"price"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"color"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"category"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int32"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"default"</span>: <span class="hljs-number">0</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"quantity"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"description"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"internalId"</span>
                    }
                ],
                <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                <span class="hljs-attr">"name"</span>: <span class="hljs-string">"fulfillment.public.product.Value"</span>,
                <span class="hljs-attr">"field"</span>: <span class="hljs-string">"before"</span>
            },
            {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"struct"</span>,
                <span class="hljs-attr">"fields"</span>: [
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int32"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"default"</span>: <span class="hljs-number">0</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"id"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"name"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"double"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"default"</span>: <span class="hljs-number">0</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"price"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"color"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"category"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int32"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"default"</span>: <span class="hljs-number">0</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"quantity"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"description"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"internalId"</span>
                    }
                ],
                <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                <span class="hljs-attr">"name"</span>: <span class="hljs-string">"fulfillment.public.product.Value"</span>,
                <span class="hljs-attr">"field"</span>: <span class="hljs-string">"after"</span>
            },
            {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"struct"</span>,
                <span class="hljs-attr">"fields"</span>: [
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"version"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"connector"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"name"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int64"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"ts_ms"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"io.debezium.data.Enum"</span>,
                        <span class="hljs-attr">"version"</span>: <span class="hljs-number">1</span>,
                        <span class="hljs-attr">"parameters"</span>: {
                            <span class="hljs-attr">"allowed"</span>: <span class="hljs-string">"true,last,false"</span>
                        },
                        <span class="hljs-attr">"default"</span>: <span class="hljs-string">"false"</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"snapshot"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"db"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"sequence"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"schema"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"table"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int64"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"txId"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int64"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"lsn"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int64"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"xmin"</span>
                    }
                ],
                <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                <span class="hljs-attr">"name"</span>: <span class="hljs-string">"io.debezium.connector.postgresql.Source"</span>,
                <span class="hljs-attr">"field"</span>: <span class="hljs-string">"source"</span>
            },
            {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                <span class="hljs-attr">"field"</span>: <span class="hljs-string">"op"</span>
            },
            {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int64"</span>,
                <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                <span class="hljs-attr">"field"</span>: <span class="hljs-string">"ts_ms"</span>
            },
            {
                <span class="hljs-attr">"type"</span>: <span class="hljs-string">"struct"</span>,
                <span class="hljs-attr">"fields"</span>: [
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"id"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int64"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"total_order"</span>
                    },
                    {
                        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"int64"</span>,
                        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
                        <span class="hljs-attr">"field"</span>: <span class="hljs-string">"data_collection_order"</span>
                    }
                ],
                <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">true</span>,
                <span class="hljs-attr">"field"</span>: <span class="hljs-string">"transaction"</span>
            }
        ],
        <span class="hljs-attr">"optional"</span>: <span class="hljs-literal">false</span>,
        <span class="hljs-attr">"name"</span>: <span class="hljs-string">"fulfillment.public.product.Envelope"</span>
    },
    <span class="hljs-attr">"payload"</span>: {
        <span class="hljs-attr">"before"</span>: <span class="hljs-literal">null</span>,
        <span class="hljs-attr">"after"</span>: {
            <span class="hljs-attr">"id"</span>: <span class="hljs-number">132</span>,
            <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Incredible Iron Pants"</span>,
            <span class="hljs-attr">"price"</span>: <span class="hljs-number">0.61987811851418</span>,
            <span class="hljs-attr">"color"</span>: <span class="hljs-string">"violet"</span>,
            <span class="hljs-attr">"category"</span>: <span class="hljs-string">"Tools"</span>,
            <span class="hljs-attr">"quantity"</span>: <span class="hljs-number">41</span>,
            <span class="hljs-attr">"description"</span>: <span class="hljs-string">"?"</span>,
            <span class="hljs-attr">"internalId"</span>: <span class="hljs-string">"B0009PC1XA"</span>
        },
        <span class="hljs-attr">"source"</span>: {
            <span class="hljs-attr">"version"</span>: <span class="hljs-string">"1.7.1.Final"</span>,
            <span class="hljs-attr">"connector"</span>: <span class="hljs-string">"postgresql"</span>,
            <span class="hljs-attr">"name"</span>: <span class="hljs-string">"fulfillment"</span>,
            <span class="hljs-attr">"ts_ms"</span>: <span class="hljs-number">1641025283281</span>,
            <span class="hljs-attr">"snapshot"</span>: <span class="hljs-string">"false"</span>,
            <span class="hljs-attr">"db"</span>: <span class="hljs-string">"product_db"</span>,
            <span class="hljs-attr">"sequence"</span>: <span class="hljs-string">"[\"23982504\",\"23982504\"]"</span>,
            <span class="hljs-attr">"schema"</span>: <span class="hljs-string">"public"</span>,
            <span class="hljs-attr">"table"</span>: <span class="hljs-string">"product"</span>,
            <span class="hljs-attr">"txId"</span>: <span class="hljs-number">622</span>,
            <span class="hljs-attr">"lsn"</span>: <span class="hljs-number">23982504</span>,
            <span class="hljs-attr">"xmin"</span>: <span class="hljs-literal">null</span>
        },
        <span class="hljs-attr">"op"</span>: <span class="hljs-string">"c"</span>,
        <span class="hljs-attr">"ts_ms"</span>: <span class="hljs-number">1641025283552</span>,
        <span class="hljs-attr">"transaction"</span>: <span class="hljs-literal">null</span>
    }
}
</code></pre>
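<p>To make the <code>before</code>/<code>after</code>/<code>op</code> fields more concrete, here is a minimal sketch of how a consumer might interpret one message. It is written in Go purely for illustration (the consumer can be in any language) and uses only the standard library. The names <code>Product</code>, <code>Envelope</code>, and <code>Describe</code> are my own assumptions, not part of Debezium, and the <code>schema</code> section of the message is simply ignored.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Product mirrors the columns of the product table as they appear in the event.
// Nullable columns are pointers so that JSON null can be represented.
type Product struct {
	ID          int     `json:"id"`
	Name        string  `json:"name"`
	Price       float64 `json:"price"`
	Color       *string `json:"color"`
	Category    string  `json:"category"`
	Quantity    int     `json:"quantity"`
	Description *string `json:"description"`
	InternalID  string  `json:"internalId"`
}

// Envelope keeps only the parts of the message this sketch needs.
// (Hypothetical type: the "schema" and "source" sections are skipped.)
type Envelope struct {
	Payload struct {
		Before *Product `json:"before"`
		After  *Product `json:"after"`
		Op     string   `json:"op"`
		TsMs   int64    `json:"ts_ms"`
	} `json:"payload"`
}

// Describe turns one raw message into a short human-readable summary,
// based on which of "before" and "after" are present.
func Describe(raw []byte) (string, error) {
	var e Envelope
	if err := json.Unmarshal(raw, &e); err != nil {
		return "", err
	}
	p := e.Payload
	switch {
	case p.Before == nil && p.After != nil:
		return fmt.Sprintf("%s: new row %d (%s)", p.Op, p.After.ID, p.After.Name), nil
	case p.Before != nil && p.After != nil:
		return fmt.Sprintf("%s: row %d changed", p.Op, p.After.ID), nil
	case p.Before != nil:
		return fmt.Sprintf("%s: row %d removed", p.Op, p.Before.ID), nil
	default:
		return fmt.Sprintf("%s: no row data", p.Op), nil
	}
}

func main() {
	// A trimmed version of the message shown above.
	raw := []byte(`{"payload":{"before":null,"after":{"id":132,"name":"Incredible Iron Pants","price":0.61987811851418,"color":"violet","category":"Tools","quantity":41,"description":"?","internalId":"B0009PC1XA"},"op":"c","ts_ms":1641025283552}}`)
	summary, err := Describe(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(summary)
}
```

<p>For the insert event above this prints <code>c: new row 132 (Incredible Iron Pants)</code>. The nullable columns (<code>color</code>, <code>description</code>) are modelled as pointers because the schema section marks them as <code>"optional": true</code>.</p>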
]]></content:encoded></item><item><title><![CDATA[Generic in Golang]]></title><description><![CDATA[This year started with new news from the golang team,  "Adding Generic!".
If you are familiar with languages like Java or C#, you are already familiar with the concept of generics. For example, let's talk about sorting an array.
package main

import (
 ...]]></description><link>https://devmarkpro.com/generic-in-golang</link><guid isPermaLink="true">https://devmarkpro.com/generic-in-golang</guid><category><![CDATA[Go Language]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Wed, 17 Mar 2021 15:57:57 GMT</pubDate><content:encoded><![CDATA[<p>This year started with new news from the golang team,  <a target="_blank" href="https://blog.golang.org/generics-proposal">"Adding Generic!"</a>.</p>
<p>If you are familiar with languages like Java or C#, you are already familiar with the concept of generics. For example, let's talk about sorting an array.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"sort"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    f := <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(msg <span class="hljs-keyword">string</span>, v []<span class="hljs-keyword">int</span>)</span></span> {
        fmt.Println(msg)
        <span class="hljs-keyword">for</span> _, x := <span class="hljs-keyword">range</span> v {
            fmt.Printf(<span class="hljs-string">"%d "</span>, x)
        }
        fmt.Println()
    }
    v := []<span class="hljs-keyword">int</span>{<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">100</span>, <span class="hljs-number">2</span>, <span class="hljs-number">-1</span>, <span class="hljs-number">80</span>}
    f(<span class="hljs-string">"before sort"</span>, v)
    sort.Ints(v)
    f(<span class="hljs-string">"after sort"</span>, v)
}
</code></pre>
<p>In the above code, we just used the <code>sort</code> package to sort a slice of integers and print it out.</p>
<p>Everything looks good, right? So what if we wanted to sort a slice of other numeric types? The <code>sort</code> package has another function called  <a target="_blank" href="https://golang.org/pkg/sort/#Float64s">Float64s</a>  (instead of <code>Ints</code>) but it's definitely not enough because we might need to do some sorting operation on other numeric types like <code>int32</code>.</p>
<p>So one approach is to define one function for each numeric type, but we all know that's not the best idea! This is where generics come into the picture.</p>
<p>Here's the definition of generic programming from <a target="_blank" href="https://en.wikipedia.org/wiki/Generic_programming">Wikipedia</a>:</p>
<blockquote>
<p>Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. </p>
</blockquote>
<p>So the point of generics is to specify the type of a variable later and provide it as a parameter. That's exactly what we need, for example, in our sort function. The only thing we need is to make sure our input <strong>complies</strong> with our conditions. For example, in the sorting scenario it must be <code>numeric</code>.</p>
<p>Let's see how we can use generics in golang:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Bored</span>[<span class="hljs-title">T</span> <span class="hljs-title">any</span>]<span class="hljs-params">(s ...T)</span> <span class="hljs-title">T</span></span> {
    <span class="hljs-keyword">return</span> s[<span class="hljs-number">0</span>]
}
</code></pre>
<p>The <code>Bored</code> function takes any number of <code>T</code> values as input (it receives them as a slice of <code>T</code>) and returns the first item as output. So WTH is <code>T</code>?
 First things first, let's see how to use our <code>Bored</code> function:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    output := Bored(<span class="hljs-string">"Hello"</span>, <span class="hljs-string">"Generics"</span>)
    fmt.Println(output)
}
</code></pre>
<p>That's great, a generic function is used exactly like a normal function! So let's talk about <code>T</code>: in this case, <code>T</code> is <code>string</code>. So what's the difference between generics and the old-fashioned Go <code>interface{}</code>? I'll explain it with an example.</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Bored</span><span class="hljs-params">(s ...<span class="hljs-keyword">interface</span>{})</span> <span class="hljs-title">interface</span></span>{} {
    <span class="hljs-keyword">return</span> <span class="hljs-string">"I am string"</span>
}
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    output := Bored(<span class="hljs-number">1</span>,<span class="hljs-number">1.1</span>,<span class="hljs-string">"hello"</span>, <span class="hljs-keyword">struct</span>{}{})
    fmt.Println(output)
}
</code></pre>
<p>The difference is that <code>interface{}</code> could be anything, but in the first function the type of <code>T</code> is inferred from the arguments passed to it. If the first one is a string, the rest of them must also be strings; otherwise, you'll get an error when compiling the application.</p>
<pre><code class="lang-go">
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Bored</span>[<span class="hljs-title">T</span> <span class="hljs-title">any</span>]<span class="hljs-params">(s ...T)</span> <span class="hljs-title">T</span></span> {
    <span class="hljs-keyword">return</span> s[<span class="hljs-number">0</span>]
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    output := Bored(<span class="hljs-string">"Hello"</span>, <span class="hljs-string">"Generics"</span>, <span class="hljs-number">1</span>)
    fmt.Println(output)
}
</code></pre>
<p>output:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">type</span> checking failed <span class="hljs-keyword">for</span> main
prog.go2:11:39: default <span class="hljs-built_in">type</span> int of 1 does not match inferred <span class="hljs-built_in">type</span> string <span class="hljs-keyword">for</span> T
</code></pre>
<p>It's nice, isn't it? For Java or C# developers, something like <code>Repository&lt;T&gt;</code> is pretty familiar, where <code>T</code> is almost always one of the application's entities.</p>
<p>Back to <code>Golang</code>: the other constraint (besides <code>any</code>) that has been introduced so far is <code>comparable</code>. It means that values of the type must support the equality operators (<code>==</code> and <code>!=</code>).</p>
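<p>For comparison, here is a rough sketch of what a <code>Repository&lt;T&gt;</code>-style type could look like in Go. It assumes the generics syntax that shipped in Go 1.18, which may differ from the playground version used in this post. <code>Repository</code>, <code>NewRepository</code>, <code>Save</code>, <code>Find</code>, and <code>User</code> are hypothetical names chosen for illustration only.</p>

```go
package main

import "fmt"

// Repository is a hypothetical in-memory store for entities of type T,
// keyed by an integer ID.
type Repository[T any] struct {
	items map[int]T
}

// NewRepository creates an empty repository for the given entity type.
func NewRepository[T any]() *Repository[T] {
	return &Repository[T]{items: make(map[int]T)}
}

// Save stores (or overwrites) the entity under the given ID.
func (r *Repository[T]) Save(id int, item T) {
	r.items[id] = item
}

// Find returns the entity with the given ID and whether it was found.
func (r *Repository[T]) Find(id int) (T, bool) {
	item, ok := r.items[id]
	return item, ok
}

// User is an example entity.
type User struct {
	Name string
}

func main() {
	users := NewRepository[User]()
	users.Save(1, User{Name: "Mark"})
	if u, ok := users.Find(1); ok {
		fmt.Println(u.Name) // prints "Mark"
	}
}
```

<p>Unlike with a function call, the type argument has to be written out here (<code>NewRepository[User]()</code>) because there is no function argument from which the compiler could infer <code>T</code>.</p>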
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    fmt.Println(Equal(<span class="hljs-number">1</span>, <span class="hljs-number">1</span>)) <span class="hljs-comment">// true</span>
    fmt.Println(Equal(<span class="hljs-number">2</span>, <span class="hljs-number">1</span>)) <span class="hljs-comment">// false</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Equal</span>[<span class="hljs-title">T</span> <span class="hljs-title">comparable</span>] <span class="hljs-params">(input1, input2 T)</span> <span class="hljs-title">bool</span></span> {
    <span class="hljs-keyword">return</span> input1 == input2
}
</code></pre>
<p>As in the previous example, we get a compile-time error if we pass arguments of different types to this function:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    fmt.Println(Equal(<span class="hljs-number">1.1</span>, <span class="hljs-number">1</span>))
}
<span class="hljs-comment">/*
output: type checking failed for main
prog.go2:8:25: default type int of 1 does not match inferred type float64 for T
*/</span>
</code></pre>
<p>I think the golang team provides some ways for developers to define the restrictions on the type <code>T</code> themselves. For now, I have only seen <code>any</code> and <code>comparable</code>; if you know more type constraints, please share them in the comment section.</p>
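<p>As a small sketch of what a self-defined restriction can look like, assuming the generics syntax that shipped in Go 1.18 or later: a constraint is an interface that lists the allowed types. The names <code>Number</code> and <code>Sum</code> are made up for this illustration.</p>

```go
package main

import "fmt"

// Number is a hypothetical constraint: any type whose underlying type
// is int or float64 satisfies it.
type Number interface {
	~int | ~float64
}

// Sum adds up a slice of any element type that satisfies Number.
func Sum[T Number](items []T) T {
	var total T
	for _, item := range items {
		total += item
	}
	return total
}

func main() {
	fmt.Println(Sum([]int{1, 2, 3}))
	fmt.Println(Sum([]float64{1.5, 2.5}))
}
```

<p>Calling <code>Sum</code> with a <code>[]string</code> should be rejected at compile time, because <code>string</code> is not part of the constraint.</p>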
]]></content:encoded></item><item><title><![CDATA[Working with big files in golang]]></title><description><![CDATA[Today, I am going to show you how to read files in golang line-by-line. Let's imagine that have a jsonl file. What's jsonl? it's json lines in a simple term, it's a file that each line of it represents a valid json object. So if we read the file line...]]></description><link>https://devmarkpro.com/working-big-files-golang</link><guid isPermaLink="true">https://devmarkpro.com/working-big-files-golang</guid><category><![CDATA[Go Language]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Sat, 13 Mar 2021 23:00:00 GMT</pubDate><content:encoded><![CDATA[<p>Today, I am going to show you how to read files in golang <strong>line-by-line</strong>. Let's imagine that we have a <code>jsonl</code> file. What's <a target="_blank" href="https://jsonlines.org/">jsonl</a>? Simply put, it's <strong>json lines</strong>: a file in which each line is a valid <code>json</code> object. So if we read the file line by line, we can Marshal/Unmarshal each line separately. Here's an example of a <code>jsonl</code> file.</p>
<p>Each line of this file represents the data of a world cup.</p>
<pre><code class="lang-json">{<span class="hljs-attr">"year"</span>:<span class="hljs-string">"2018"</span>,<span class="hljs-attr">"host"</span>:<span class="hljs-string">"Russia"</span>,<span class="hljs-attr">"winner"</span>:<span class="hljs-string">"France"</span>}
{<span class="hljs-attr">"year"</span>:<span class="hljs-string">"2014"</span>,<span class="hljs-attr">"host"</span>:<span class="hljs-string">"Brazil"</span>,<span class="hljs-attr">"winner"</span>:<span class="hljs-string">"Germany"</span>}
{<span class="hljs-attr">"year"</span>:<span class="hljs-string">"2010"</span>,<span class="hljs-attr">"host"</span>:<span class="hljs-string">"South Africa"</span>,<span class="hljs-attr">"winner"</span>:<span class="hljs-string">"Spain"</span>}
{<span class="hljs-attr">"year"</span>:<span class="hljs-string">"2006"</span>,<span class="hljs-attr">"host"</span>:<span class="hljs-string">"Germany"</span>,<span class="hljs-attr">"winner"</span>:<span class="hljs-string">"Italy"</span>}
</code></pre>
<h2 id="working-with-bufioscanner">Working with bufio.Scanner</h2>
<p>So let’s read this file line by line in the easiest and most convenient way. The easiest way to read a file (at least for me) is using the <a target="_blank" href="https://golang.org/pkg/bufio/#Scanner">scanner</a> from the <strong>bufio</strong> package in the standard library. First, we need to create an instance with the <code>NewScanner</code> function, which is a very familiar way of constructing structs in golang. This function accepts an <code>io.Reader</code> interface as input, and the good news is that <code>*os.File</code> implements this interface. It means we can open a file and pass the file pointer to <code>bufio.NewScanner</code>. Let’s see it in action.</p>
<pre><code class="lang-go"><span class="hljs-comment">// first open the file</span>
file, err := os.Open(<span class="hljs-string">"/Users/Mark/fifa-winners.jsonl"</span>)
<span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
    log.Fatalf(<span class="hljs-string">"could not open the file: %v"</span>, err)
}
<span class="hljs-comment">// don't forget to close the file.</span>
<span class="hljs-keyword">defer</span> file.Close()
<span class="hljs-comment">// finally, we can have our scanner</span>
scanner := bufio.NewScanner(file)
</code></pre>
<p>So, we have the scanner and we are ready to go... The scanner has a method named <strong>Scan</strong>, which moves the scanner to the next token. I'll tell you what that means later, but for now, let's say each time <strong>Scan</strong> is called, we read one line of our file. So if we want to move the scanner all the way through the file, we'd call Scan in an infinite loop! The question is: <strong>how do we know when to break the loop?</strong> It's easy! Scan returns true until it reaches the end of the file or hits an error.</p>
<pre><code class="lang-go"><span class="hljs-keyword">for</span> {
    <span class="hljs-keyword">if</span> scanner.Scan() {
        <span class="hljs-comment">// we have a new line in each iteration</span>
        <span class="hljs-keyword">continue</span>
    }
    <span class="hljs-comment">// we are done let's break the loop</span>
    <span class="hljs-keyword">break</span>
}
<span class="hljs-comment">// the rest of our spaghetti</span>
</code></pre>
<p>This code works, but did you know we can tell golang to keep a loop running for as long as a specific condition holds? All the code above can be as simple as the code below:</p>
<pre><code class="lang-go"><span class="hljs-keyword">for</span> scanner.Scan() {
    <span class="hljs-comment">// we have a new line in each iteration</span>
}
<span class="hljs-comment">// the rest of our spaghetti</span>
</code></pre>
<p>Well, let's get back to the point! We can get each line as <strong>bytes</strong> or as a <strong>string</strong> by calling the <code>Bytes()</code> and <code>Text()</code> methods.</p>
<pre><code class="lang-go"><span class="hljs-keyword">for</span> scanner.Scan() {
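  <span class="hljs-comment">// b is a slice of bytes ([]byte); it is only valid until the next call to Scan</span>
  b := scanner.Bytes()
  <span class="hljs-comment">// s is a string holding a copy of the same data</span>
  s := scanner.Text()
  <span class="hljs-comment">// use both values so the snippet compiles</span>
  fmt.Println(<span class="hljs-built_in">len</span>(b), s)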
}
</code></pre>
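<p>Frankly, these methods return the same data! For example, <code>string(scanner.Bytes())</code> gives you the same result, and that's exactly what happens inside the <code>Text()</code> method. The one difference is that the slice returned by <code>Bytes()</code> may be overwritten by the next call to <code>Scan</code>, while <code>Text()</code> returns a newly allocated string.</p>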
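<p>Since the whole point of reading a <code>jsonl</code> file line by line is to unmarshal each line on its own, here is a minimal sketch of that step. The <code>WorldCup</code> struct and the <code>decodeJSONL</code> helper are names I made up for this example, and the input is an in-memory string instead of a real file so the snippet stands on its own.</p>

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// WorldCup mirrors one line of the sample file (hypothetical type name).
type WorldCup struct {
	Year   string `json:"year"`
	Host   string `json:"host"`
	Winner string `json:"winner"`
}

// decodeJSONL unmarshals every line of data into a WorldCup.
// In a real program you would pass the opened file to bufio.NewScanner instead.
func decodeJSONL(data string) ([]WorldCup, error) {
	var cups []WorldCup
	scanner := bufio.NewScanner(strings.NewReader(data))
	for scanner.Scan() {
		var cup WorldCup
		// Bytes() is only valid until the next Scan; that is fine here
		// because Unmarshal copies the values into the struct.
		if err := json.Unmarshal(scanner.Bytes(), &cup); err != nil {
			return nil, err
		}
		cups = append(cups, cup)
	}
	return cups, scanner.Err()
}

func main() {
	data := `{"year":"2018","host":"Russia","winner":"France"}
{"year":"2014","host":"Brazil","winner":"Germany"}`
	cups, err := decodeJSONL(data)
	if err != nil {
		log.Fatalf("could not decode: %v", err)
	}
	for _, cup := range cups {
		fmt.Println(cup.Year, cup.Winner)
	}
}
```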
<p>We read our file, so is the mission completed? <strong>Not exactly</strong>, because we didn't handle any error yet. </p>
<p>The scanner has another method called <code>Err()</code>. It gives you the <strong>first</strong> error that happened during the scan process. When something bad happens while the scanner is moving through the file, <em>Scan</em> returns false, so our loop ends immediately. After the loop, we can get that error and deal with it. </p>
<p>If we want to know on which line of the file the error happened, we can use a traditional approach: we know the scanner starts from the beginning of the file (line number 1), so we can define a variable outside of the loop that represents the line number and increase it in each iteration.</p>
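<pre><code class="lang-go">lineNumber := <span class="hljs-number">1</span>
<span class="hljs-keyword">for</span> scanner.Scan() {
    fmt.Println(scanner.Text())
    <span class="hljs-comment">// lineNumber now points at the line the next Scan will try to read</span>
    lineNumber++
}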
<span class="hljs-comment">// the rest of our spaghetti</span>
<span class="hljs-keyword">if</span> err := scanner.Err(); err != <span class="hljs-literal">nil</span> {
    log.Fatalf(<span class="hljs-string">"something bad happened in the line %v: %v"</span>, lineNumber, err)
}
</code></pre>
<p>Another thing to know about the <code>Err()</code> function is that it ignores <code>io.EOF</code>, so if it gives us an error, it's a REAL one!</p>
<p>Let's have a run:</p>
<pre><code class="lang-bash">➜ big-files (main) ✗ go run main.go
{<span class="hljs-string">"year"</span>:<span class="hljs-string">"2018"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"Russia"</span>,<span class="hljs-string">"winner"</span>:<span class="hljs-string">"France"</span>}
{<span class="hljs-string">"year"</span>:<span class="hljs-string">"2014"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"Brazil"</span>,<span class="hljs-string">"winner"</span>:<span class="hljs-string">"Germany"</span>}
{<span class="hljs-string">"year"</span>:<span class="hljs-string">"2010"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"South Africa"</span>,<span class="hljs-string">"winner"</span>:<span class="hljs-string">"Spain"</span>}
{<span class="hljs-string">"year"</span>:<span class="hljs-string">"2006"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"Germany"</span>,<span class="hljs-string">"winner"</span>:<span class="hljs-string">"Italy"</span>}
</code></pre>
<p>It worked. So what's next?</p>
<h2 id="fix-bufioscanner-token-too-long-error">Fix bufio.Scanner: token too long error</h2>
<p>We said each line of a jsonl file represents a valid json object, so a single line can be very long. We also said the <code>Scan</code> function moves the scanner to the next token, but the question is: where is the next token!?</p>
<p>The <code>Scanner</code> type has another method that is not as famous as its siblings: <code>Buffer</code>. It takes a byte slice and an integer as input, and it lets you set the initial buffer and the maximum size the buffer may grow to.</p>
<p>The <strong>bufio</strong> package has a default maximum token size, <code>bufio.MaxScanTokenSize</code>, which equals <code>64 * 1024</code> bytes (64 KiB, about 65.5 kB). So if one of our lines is bigger than this size, we get the error <code>bufio.Scanner: token too long</code>. </p>
<p>We found the answer to our question: the next token ends at the end of the line, <strong>but</strong> it has to fit in the scanner's buffer (64 KiB by default). If it doesn't, <code>Scan</code> returns false and <code>Err()</code> reports the error above.</p>
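<p>To see the limit in action without hunting for a huge file, here is a small sketch that feeds a default scanner one line slightly longer than <code>bufio.MaxScanTokenSize</code>. The helper names <code>scanTooLong</code> and <code>lineOfLength</code> are mine, not part of the standard library.</p>

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// lineOfLength builds a single line made of n 'x' bytes plus a newline.
func lineOfLength(n int) string {
	return strings.Repeat("x", n) + "\n"
}

// scanTooLong reports whether scanning input with a default Scanner
// stops with bufio.ErrTooLong.
func scanTooLong(input string) bool {
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		// nothing to do, we only care about how the scan ends
	}
	return errors.Is(scanner.Err(), bufio.ErrTooLong)
}

func main() {
	fmt.Println("short line too long?", scanTooLong(lineOfLength(20)))
	fmt.Println("huge line too long? ", scanTooLong(lineOfLength(bufio.MaxScanTokenSize+1)))
}
```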
<h3 id="approach-1-bigger-buffer-size">Approach 1: Bigger buffer size</h3>
<p>The first approach to tackling this problem is to increase the buffer size. Actually, the name <code>bufio.MaxScanTokenSize</code> is a little misleading, because it's not the actual maximum; it's <strong>THE DEFAULT MAXIMUM</strong> size. So we can increase it.</p>
<pre><code class="lang-go">buf := []<span class="hljs-keyword">byte</span>{}
scanner := bufio.NewScanner(file)
<span class="hljs-comment">// increase the maximum buffer size to 2 MB</span>
scanner.Buffer(buf, <span class="hljs-number">2048</span>*<span class="hljs-number">1024</span>)
</code></pre>
<p>Now we can process <code>jsonl</code> files with lines up to 2 MB. It's good, but what if we need more? We can increase this number as much as we want (probably), but if our file has <em>5,000,000</em> rows and just one of them is 100 MB, we need to let the scanner's buffer grow to that size just for one line, or use another approach!</p>
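<p>Here is a self-contained sketch of the bigger-buffer approach: a 100 KiB line that would break a default scanner goes through once the maximum is raised. <code>countLines</code> and <code>sampleInput</code> are hypothetical helpers for this example, and the initial 64 KiB buffer is just one reasonable starting size.</p>

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"strings"
)

// sampleInput returns three lines; the middle one is 100 KiB long.
func sampleInput() string {
	return "short\n" + strings.Repeat("x", 100*1024) + "\nshort again\n"
}

// countLines scans input and lets the buffer grow up to maxTokenSize bytes.
func countLines(input string, maxTokenSize int) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	// start with a 64 KiB buffer; the scanner grows it on demand
	scanner.Buffer(make([]byte, 0, 64*1024), maxTokenSize)
	lines := 0
	for scanner.Scan() {
		lines++
	}
	return lines, scanner.Err()
}

func main() {
	lines, err := countLines(sampleInput(), 2*1024*1024)
	if err != nil {
		log.Fatalf("scan failed: %v", err)
	}
	fmt.Println("lines read:", lines)
}
```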
<h3 id="approach2">Approach 2: ReadLine</h3>
<p>The next way to read such a tough file is to use another function! Bufio (buffered I/O) gives us way more than a simple scanner to work with files, and we have to choose one of its tools based on our needs and requirements. In this case, Scanner cannot satisfy what we need, so let's take a look at the <code>ReadLine</code> method of <code>bufio.Reader</code>. It's a little bit lower level than the scanner. Generally speaking, when you hear the word <code>lower-level</code>, you have to do more work for simple things, but you have more access and power!</p>
<p>So let's get started. First we need a reader:</p>
<pre><code class="lang-go">reader := bufio.NewReader(file)
</code></pre>
<p>The reader has a <strong>ReadLine</strong> method which <em>tries</em> to read the entire line. Just like with the scanner, we need to call this method in a for loop, but since we are at the <em>lower level</em>, we don't get a nice, easy boolean in return anymore to tell us when to break the loop.</p>
<p>The other difference is the error that we get from the <code>ReadLine</code> function, which can also be <code>io.EOF</code>. That's not a real error for us, so we have to handle it too.</p>
<pre><code class="lang-go">reader := bufio.NewReader(file)
<span class="hljs-keyword">for</span> {
    line, _, err := reader.ReadLine()
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">if</span> err == io.EOF {
            <span class="hljs-keyword">break</span>
        }
        log.Fatalf(<span class="hljs-string">"a real error happened here: %v\n"</span>, err)
    }
    fmt.Println(<span class="hljs-keyword">string</span>(line))
}
</code></pre>
<p>As you have probably noticed, so far we have just read the file; we did not actually solve the problem that we had with the gigantic lines.</p>
<p>We ignored the second value that we got from the ReadLine function, and that one is exactly what we need to solve our problem. It's a boolean named <code>isPrefix</code>. If the line is too long and <code>ReadLine</code> cannot put all of its content in the buffer, it returns the filled buffer and sets isPrefix to true, which means we will get the <strong>next part</strong> of the line in the next call of the <code>ReadLine</code> function.</p>
<p>So we just need to keep calling the ReadLine function until <code>isPrefix</code> becomes <code>false</code>; then we can go for the next line of our file. You probably already noticed that we are talking about a function that calls <code>ReadLine</code> repeatedly in a loop. First, I define that function.</p>
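<p>To watch <code>isPrefix</code> flip, we can shrink the reader's buffer on purpose. This sketch assumes a 16-byte buffer created with <code>bufio.NewReaderSize</code>, so a 26-character line cannot fit in one call. The <code>chunk</code> type and <code>readChunks</code> helper are invented for this demonstration.</p>

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// chunk records what a single ReadLine call returned.
type chunk struct {
	Text     string
	IsPrefix bool
}

// readChunks calls ReadLine on input through a deliberately tiny
// 16-byte buffer until ReadLine returns an error (io.EOF at the end).
func readChunks(input string) []chunk {
	reader := bufio.NewReaderSize(strings.NewReader(input), 16)
	var chunks []chunk
	for {
		line, isPrefix, err := reader.ReadLine()
		if err != nil {
			return chunks
		}
		// string(line) copies the bytes, which matters because the
		// returned slice is only valid until the next read
		chunks = append(chunks, chunk{Text: string(line), IsPrefix: isPrefix})
	}
}

func main() {
	for _, c := range readChunks("abcdefghijklmnopqrstuvwxyz\nshort\n") {
		fmt.Printf("%q isPrefix=%v\n", c.Text, c.IsPrefix)
	}
}
```

<p>The long line comes back in two pieces, the first one flagged with <code>isPrefix=true</code>, while the short line arrives in a single call.</p>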
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">read</span><span class="hljs-params">(r *bufio.Reader)</span> <span class="hljs-params">([]<span class="hljs-keyword">byte</span>, error)</span></span> {
    <span class="hljs-keyword">var</span> (
        isPrefix = <span class="hljs-literal">true</span>
        err      error
        line, ln []<span class="hljs-keyword">byte</span>
    )

    <span class="hljs-keyword">for</span> isPrefix &amp;&amp; err == <span class="hljs-literal">nil</span> {
        line, isPrefix, err = r.ReadLine()
        ln = <span class="hljs-built_in">append</span>(ln, line...)
    }

    <span class="hljs-keyword">return</span> ln, err
}
</code></pre>
<p><code>isPrefix</code> is true in the first place and the error is nil, so we make sure the for loop runs at least once. It behaves like a do-while loop. We re-assign the variables inside the loop, so we keep calling <code>r.ReadLine</code> until we get an error OR <code>isPrefix</code> is false. In each iteration, we append the bytes that we get from <code>r.ReadLine()</code> to another variable, because the returned slice is only valid until the next read. Now it's time to call this function inside the main function.</p>
<pre><code class="lang-go">reader := bufio.NewReader(file)
<span class="hljs-keyword">for</span> {
    line, err := read(reader)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">if</span> err == io.EOF {
            <span class="hljs-keyword">break</span>
        }
        log.Fatalf(<span class="hljs-string">"a real error happened here: %v\n"</span>, err)
    }
    fmt.Println(<span class="hljs-keyword">string</span>(line))
}
</code></pre>
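<p>As a side note, if you don't need the <code>isPrefix</code> control, <code>bufio.Reader</code> also has <code>ReadBytes</code>, which keeps reading until it sees the delimiter, however long the line is. This is an alternative sketch, not the approach used in the rest of this post; <code>readLines</code> is a hypothetical helper and the input is an in-memory string.</p>

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"log"
	"strings"
)

// readLines returns every line of input without its trailing line ending.
func readLines(input string) ([]string, error) {
	reader := bufio.NewReader(strings.NewReader(input))
	var lines []string
	for {
		line, err := reader.ReadBytes('\n')
		// ReadBytes can return data together with io.EOF when the
		// last line has no trailing newline, so keep the data first
		if len(line) > 0 {
			lines = append(lines, string(bytes.TrimRight(line, "\r\n")))
		}
		if err != nil {
			if err == io.EOF {
				return lines, nil
			}
			return lines, err
		}
	}
}

func main() {
	lines, err := readLines("first\nsecond\nlast line without newline")
	if err != nil {
		log.Fatalf("a real error happened here: %v", err)
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```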
<p>That's it! We solved the problem. Here's the complete code:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"bufio"</span>
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"io"</span>
    <span class="hljs-string">"log"</span>
    <span class="hljs-string">"os"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-comment">// first open the file</span>
    file, err := os.Open(<span class="hljs-string">"./fifa-winners.jsonl"</span>)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        log.Fatalf(<span class="hljs-string">"could not open the file: %v"</span>, err)
    }
    <span class="hljs-keyword">defer</span> file.Close()
    log.Println(<span class="hljs-string">"******************* READ WITH SCANNER *******************"</span>)
    readWithScanner(file)
    log.Println(<span class="hljs-string">"******************* READ WITH READLINE() *******************"</span>)

    <span class="hljs-comment">// we read this file once, so the cursor is at the end of the file;</span>
    <span class="hljs-comment">// reset the offset to get back to the first line before reading it again</span>
    <span class="hljs-keyword">if</span> _, err := file.Seek(<span class="hljs-number">0</span>, io.SeekStart); err != <span class="hljs-literal">nil</span> {
        log.Fatalf(<span class="hljs-string">"could not rewind the file: %v"</span>, err)
    }
    readWithReadLine(file)

    log.Println(<span class="hljs-string">"we read a file twice!"</span>)
}

<span class="hljs-comment">// Read with simple scanner</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">readWithScanner</span><span class="hljs-params">(file *os.File)</span></span> {
    <span class="hljs-comment">// main already opened the file (and will close it), so we only need a scanner</span>
    buf := []<span class="hljs-keyword">byte</span>{}
    scanner := bufio.NewScanner(file)
    scanner.Buffer(buf, <span class="hljs-number">2048</span>*<span class="hljs-number">1024</span>)
    lineNumber := <span class="hljs-number">1</span>
    <span class="hljs-keyword">for</span> scanner.Scan() {
        fmt.Println(scanner.Text())
        lineNumber++
    }
    <span class="hljs-comment">// the rest of our spaghetti</span>
    <span class="hljs-keyword">if</span> err := scanner.Err(); err != <span class="hljs-literal">nil</span> {
        log.Fatalf(<span class="hljs-string">"something bad happened in the line %v: %v"</span>, lineNumber, err)
    }
}

<span class="hljs-comment">// Read with Readline function</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">read</span><span class="hljs-params">(r *bufio.Reader)</span> <span class="hljs-params">([]<span class="hljs-keyword">byte</span>, error)</span></span> {
    <span class="hljs-keyword">var</span> (
        isPrefix = <span class="hljs-literal">true</span>
        err      error
        line, ln []<span class="hljs-keyword">byte</span>
    )

    <span class="hljs-keyword">for</span> isPrefix &amp;&amp; err == <span class="hljs-literal">nil</span> {
        line, isPrefix, err = r.ReadLine()
        ln = <span class="hljs-built_in">append</span>(ln, line...)
    }

    <span class="hljs-keyword">return</span> ln, err
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">readWithReadLine</span><span class="hljs-params">(file *os.File)</span></span> {
    reader := bufio.NewReader(file)
    <span class="hljs-keyword">for</span> {
        line, err := read(reader)
        <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
            <span class="hljs-keyword">if</span> err == io.EOF {
                <span class="hljs-keyword">break</span>
            }
            log.Fatalf(<span class="hljs-string">"a real error happened here: %v\n"</span>, err)
        }
        fmt.Println(<span class="hljs-keyword">string</span>(line))
    }
}
</code></pre>
]]></content:encoded></item><item><title><![CDATA[How to embed the content of a file to a variable in Golang]]></title><description><![CDATA[In this article, we were talking about how to deal with a large file in Golang. But have you ever seen yourself in a situation that just wanted to read a file and load its entire content to a variable and work with it? I have been in the same situati...]]></description><link>https://devmarkpro.com/embed-file-variable-golang</link><guid isPermaLink="true">https://devmarkpro.com/embed-file-variable-golang</guid><category><![CDATA[Go Language]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Thu, 11 Mar 2021 23:00:00 GMT</pubDate><content:encoded><![CDATA[<p>In <a target="_blank" href="https://devmarkpro.com/working-big-files-golang/">this article</a>, we talked about how to deal with a large file in Golang. But have you ever found yourself in a situation where you just wanted to read a file, load its entire content into a variable, and work with it? I have been in that situation a lot, for example when reading a specific configuration from a file that is always there! I mean, you know that you need that file to build/run your application. Let's talk about it in code. Imagine we have a file named <code>config.txt</code> with the content below:</p>
<pre><code>I am a file that you need me ;)
</code></pre><p>And we want to read this file.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"io/ioutil"</span>
    <span class="hljs-string">"log"</span>
    <span class="hljs-string">"os"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    file, err := os.Open(<span class="hljs-string">"config.txt"</span>)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        log.Fatalf(<span class="hljs-string">"missing config file: %v"</span>, err)
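    }
    <span class="hljs-comment">// don't forget to close the file</span>
    <span class="hljs-keyword">defer</span> file.Close()
    content, err := ioutil.ReadAll(file)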
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        log.Fatalf(<span class="hljs-string">"could not read config file: %v"</span>, err)
    }
    fmt.Println(<span class="hljs-keyword">string</span>(content))
}
<span class="hljs-comment">// output: I am a file that you need me ;)</span>
</code></pre>
<p>If the code above sounds familiar to you, I have great news for you: Go (since version 1.16) offers a nicer and shorter way to do that. There is a package called <a target="_blank" href="https://golang.org/pkg/embed/">embed</a> that does all of it for you in a single line. All you need to do is pass the file path in a <code>//go:embed</code> directive above your variable. The file is read at build time and its content is compiled into the binary.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    _ <span class="hljs-string">"embed"</span>
    <span class="hljs-string">"fmt"</span>
)

<span class="hljs-comment">//go:embed config.txt</span>
<span class="hljs-keyword">var</span> content <span class="hljs-keyword">string</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    fmt.Print(content)
}
<span class="hljs-comment">// output: I am a file that you need me ;)</span>
</code></pre>
<p>That's it! You have the content of the file in the <code>content</code> variable! Do you need to have it as <code>[]byte</code>? You just need to change your variable type.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    _ <span class="hljs-string">"embed"</span>
    <span class="hljs-string">"fmt"</span>
)

<span class="hljs-comment">//go:embed config.txt</span>
<span class="hljs-keyword">var</span> content []<span class="hljs-keyword">byte</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    fmt.Print(content)
}
<span class="hljs-comment">// output: [73 32 97 109 32 97 32 102 105 108 101 32 116 104 97 116 32 121 111 117 32 110 101 101 100 32 109 101 32 59 41]</span>
</code></pre>
]]></content:encoded></item><item><title><![CDATA[Error handling in goroutines with errgroup]]></title><description><![CDATA[Golang made life easier with goroutine, however, sometimes it's difficult to handle errors that happened inside a goroutine effectively. For example, imagine you have an array of some kind of actions and wanted to run a specific function on each one ...]]></description><link>https://devmarkpro.com/error-handling-goroutines-errgroup</link><guid isPermaLink="true">https://devmarkpro.com/error-handling-goroutines-errgroup</guid><category><![CDATA[Go Language]]></category><category><![CDATA[error handling]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Mon, 11 Jan 2021 16:53:52 GMT</pubDate><content:encoded><![CDATA[<p><a target="_blank" href="https://golang.org/">Golang</a> made life easier with <strong>goroutines</strong>; however, sometimes it's difficult to handle errors that happen inside a goroutine effectively. For example, imagine you have an array of some kind of actions and want to run a specific function on each one of them. At the same time, in case of an error, you want to propagate that error to the higher function.</p>
<p>Let's explain it in code. Imagine that we have a set of <code>runners</code> and each runner has a <code>Handle</code> function.</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> HandlerFunc <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span>
<span class="hljs-keyword">type</span> Runner <span class="hljs-keyword">struct</span> {
    Name   <span class="hljs-keyword">string</span>
    Handle HandlerFunc
}
</code></pre>
<p>I also like to define another type named <code>Runners</code>. It's just a simple wrapper around a slice of runners:</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> Runners []Runner
</code></pre>
<p>Hence, I can define a method that runs through all the runners like this:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r Runners)</span> <span class="hljs-title">Execute</span><span class="hljs-params">()</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-keyword">for</span> _, runner := <span class="hljs-keyword">range</span> r {
        <span class="hljs-keyword">if</span> err := runner.Handle(runner.Name); err != <span class="hljs-literal">nil</span> {
            <span class="hljs-keyword">return</span> err
        }
    }
    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}
</code></pre>
<p>and finally, define some runners and execute them:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    runners := Runners{
        Runner{
            Name: <span class="hljs-string">"1"</span>,
            Handle: <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span> {
                fmt.Printf(<span class="hljs-string">"runner %s is running\n"</span>, input)
                <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
            },
        },
        Runner{
            Name: <span class="hljs-string">"2"</span>,
            Handle: <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span> {
                <span class="hljs-keyword">return</span> fmt.Errorf(<span class="hljs-string">"something bad happened in runner [%s]"</span>, input)
            },
        },
        Runner{
            Name: <span class="hljs-string">"3"</span>,
            Handle: <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span> {
                fmt.Printf(<span class="hljs-string">"runner %s is running\n"</span>, input)
                <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
            },
        },
    }

    err := runners.Execute()
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        fmt.Printf(<span class="hljs-string">"execution failed: %v"</span>, err)
    }  
}
</code></pre>
<p>By running this piece of code I get this output: <code>runner 1 is running</code>, followed by the error from runner 2. The problem is that we never ran the third runner. We could instead print the errors and continue the execution, as in the version below, but then we cannot propagate the error to the higher function!</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r Runners)</span> <span class="hljs-title">Execute</span><span class="hljs-params">()</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-keyword">for</span> _, runner := <span class="hljs-keyword">range</span> r {
        <span class="hljs-keyword">if</span> err := runner.Handle(runner.Name); err != <span class="hljs-literal">nil</span> {
            fmt.Printf(<span class="hljs-string">"error happened in runner [%s]: %v\n"</span>, runner.Name, err)
        }
    }
    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}
</code></pre>
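<p>For the sequential case there is a middle ground: keep going after a failure, collect every error, and return them together. This sketch assumes Go 1.20 or later, because it relies on <code>errors.Join</code>; the <code>sampleRunners</code> helper is made up for the example.</p>

```go
package main

import (
	"errors"
	"fmt"
)

type HandlerFunc func(input string) error

type Runner struct {
	Name   string
	Handle HandlerFunc
}

type Runners []Runner

// Execute runs every runner, even after a failure, and returns all
// collected errors joined into one (nil if every runner succeeded).
func (r Runners) Execute() error {
	var errs []error
	for _, runner := range r {
		if err := runner.Handle(runner.Name); err != nil {
			errs = append(errs, fmt.Errorf("runner [%s]: %w", runner.Name, err))
		}
	}
	return errors.Join(errs...)
}

// sampleRunners builds three runners; the second one always fails and
// the others record their name in ran (hypothetical helper).
func sampleRunners(ran *[]string) Runners {
	ok := func(input string) error {
		*ran = append(*ran, input)
		return nil
	}
	fail := func(input string) error {
		return fmt.Errorf("something bad happened")
	}
	return Runners{
		{Name: "1", Handle: ok},
		{Name: "2", Handle: fail},
		{Name: "3", Handle: ok},
	}
}

func main() {
	var ran []string
	err := sampleRunners(&ran).Execute()
	fmt.Println("runners that finished:", ran)
	fmt.Println("execution failed:", err)
}
```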
<p>Now, what if we want to run each runner in a different <strong>goroutine</strong> with the exact same scenario? Let's change the code a little bit to see if it works.</p>
<p>First things first: we add <strong>go</strong> in front of the function call. So our <code>Execute</code> function should look something like this:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r Runners)</span> <span class="hljs-title">Execute</span><span class="hljs-params">()</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-keyword">for</span> _, runner := <span class="hljs-keyword">range</span> r {
        <span class="hljs-keyword">go</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(runner Runner)</span> <span class="hljs-title">error</span></span> {
            <span class="hljs-keyword">if</span> err := runner.Handle(runner.Name); err != <span class="hljs-literal">nil</span> {
                <span class="hljs-keyword">return</span> err
            }
            <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
        }(runner)
    }
    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}
</code></pre>
<p>However, as you probably know, if we run the program again, there will be nothing in the output, because we never told the application to wait until the <strong>goroutines</strong> finish their work. For the sake of simplicity, let's just add a <code>Sleep</code> to the <code>main</code> function:</p>
<pre><code class="lang-go">err := runners.Execute()
time.Sleep(<span class="hljs-number">3</span> * time.Second)
<span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
    fmt.Printf(<span class="hljs-string">"execution failed: %v"</span>, err)
}
</code></pre>
<p>Now, let's run it again. This time the output should be something like this (the order of the lines may vary):</p>
<pre><code class="lang-bash">runner 1 is running
runner 3 is running
</code></pre>
<p>OK, it's not OK! The execution worked well because we could execute runners 1 and 3; however, we still didn't do anything about the error: the value returned by a function started with <strong>go</strong> is simply discarded.</p>
<h3 id="welcome-to-errgroup">Welcome to errgroup</h3>
<p>Now, it's time to solve the problem with <a target="_blank" href="https://godoc.org/golang.org/x/sync/errgroup">errgroup</a>. It's <strong>REALLY</strong> simple and easy. It works like <a target="_blank" href="https://gobyexample.com/waitgroups">WaitGroups</a> from the <code>sync</code> package. In fact, it uses a wait group behind the scenes, but since the scenario above is quite common, <strong>errgroup</strong> makes life easier for us!</p>
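<p>If you're familiar with wait groups, you already know what <code>wg.Done()</code> and <code>wg.Wait()</code> mean. <code>errgroup</code> offers the same idea: its <code>Go</code> method takes care of the <code>Add</code>/<code>Done</code> bookkeeping, and <code>Wait</code> blocks until everything has finished. Let's make things clear in code. First, we declare a package-level <code>g</code> variable, so that both <code>Execute</code> and <code>main</code> can reach it:</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> g errgroup.Group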
</code></pre>
<p>Then we start each handler in its own <strong>goroutine</strong> with the group's <code>Go</code> method, so our <code>Execute</code> function turns into this:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r Runners)</span> <span class="hljs-title">Execute</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">for</span> _, runner := <span class="hljs-keyword">range</span> r {
        rx := runner
        g.Go(<span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span> <span class="hljs-title">error</span></span> {
            <span class="hljs-keyword">return</span> rx.Handle(rx.Name)
        })
    }
}
</code></pre>
<p>The <code>Go</code> method takes a <code>func</code> that returns an error. If that error is not <code>nil</code>, you get it back from the <code>Wait</code> method (only the first such error is kept).</p>
<p>Here's the call to <code>Wait</code>:</p>
<pre><code class="lang-go">runners.Execute()
err := g.Wait()
<span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
    fmt.Printf(<span class="hljs-string">"execution failed: %v"</span>, err)
}
</code></pre>
<p>As you can see, we don't need <code>time.Sleep()</code> anymore, because <code>Wait</code> blocks until the last <strong>goroutine</strong> has returned and then returns the <strong>first</strong> error that happened. If all functions run without error, it returns <code>nil</code>.</p>
<p>The output is something like this:</p>
<pre><code class="lang-bash">runner 3 is running
runner 1 is running
execution failed: something bad happened <span class="hljs-keyword">in</span> runner [2]
</code></pre>
<p>and here's the complete code:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>

    <span class="hljs-string">"golang.org/x/sync/errgroup"</span>
)

<span class="hljs-keyword">var</span> g errgroup.Group

<span class="hljs-keyword">type</span> HandlerFunc <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span>

<span class="hljs-keyword">type</span> Runner <span class="hljs-keyword">struct</span> {
    Name   <span class="hljs-keyword">string</span>
    Handle HandlerFunc
}

<span class="hljs-keyword">type</span> Runners []Runner

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r Runners)</span> <span class="hljs-title">Execute</span><span class="hljs-params">()</span></span> {

    <span class="hljs-keyword">for</span> _, runner := <span class="hljs-keyword">range</span> r {
        rx := runner
        g.Go(<span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span> <span class="hljs-title">error</span></span> {
            <span class="hljs-keyword">return</span> rx.Handle(rx.Name)
        })
    }
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    runners := Runners{
        Runner{
            Name: <span class="hljs-string">"1"</span>,
            Handle: <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span> {
                fmt.Printf(<span class="hljs-string">"runner %s is running\n"</span>, input)
                <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
            },
        },
        Runner{
            Name: <span class="hljs-string">"2"</span>,
            Handle: <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span> {
                <span class="hljs-keyword">return</span> fmt.Errorf(<span class="hljs-string">"something bad happened in runner [%s]"</span>, input)
            },
        },
        Runner{
            Name: <span class="hljs-string">"3"</span>,
            Handle: <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(input <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">error</span></span> {
                fmt.Printf(<span class="hljs-string">"runner %s is running\n"</span>, input)
                <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
            },
        },
    }
    runners.Execute()
    err := g.Wait()
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        fmt.Printf(<span class="hljs-string">"execution failed: %v"</span>, err)
    }
}
</code></pre>
]]></content:encoded></item><item><title><![CDATA[chromedp: How to get the URL of the current page]]></title><description><![CDATA[In the previous article, we developed a simple application and open google.com with chromedp. in this article we are going to take a look and one of the chromedp functionalities to deal with the page and URL.
The way that the developers decided to us...]]></description><link>https://devmarkpro.com/chromedp-get-current-page-url</link><guid isPermaLink="true">https://devmarkpro.com/chromedp-get-current-page-url</guid><category><![CDATA[Go Language]]></category><category><![CDATA[Google Chrome]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Mon, 04 Jan 2021 19:05:16 GMT</pubDate><content:encoded><![CDATA[<p>In the <a target="_blank" href="https://devmarkpro.com/chromedp-get-started/">previous article</a>, we developed a simple application and opened google.com with chromedp. In this article, we are going to take a look at one of the chromedp features for working with the page and its URL.</p>
<p>The library's developers chose an approach that is very popular among Go developers: passing a pointer to a variable into the function, which then fills it in. I see the same pattern a lot in Go libraries; <a target="_blank" href="https://gorm.io/index.html">GORM</a>, for example, uses it as well.</p>
<p>In today’s scenario, we want to open the chromedp GitHub page, go to one of the files inside the repository, and print the URL of that page. Simple and easy! So let’s get started!</p>
<p>For this purpose, we need to select the element that we want to click. Remember, we just want to automate the things that we would otherwise do manually. That means if we want to open the file named util_test.go, we first need to go to the URL https://github.com/chromedp/chromedp, then find the file in the list, and then click on it. To find elements inside the page we can simply use <a target="_blank" href="https://www.w3schools.com/cssref/css_selectors.asp">CSS Selectors</a>.</p>
<pre><code class="lang-go"><span class="hljs-comment">// our selector to find the file inside the page.</span>
selector := <span class="hljs-string">"a[title='util_test.go']"</span>
tasks := chromedp.Tasks{
    <span class="hljs-comment">// go to the page</span>
    chromedp.Navigate(<span class="hljs-string">"https://github.com/chromedp/chromedp"</span>),
    <span class="hljs-comment">// wait until the element is in the page and enabled (not disabled)</span>
    chromedp.WaitEnabled(selector),
    <span class="hljs-comment">// do a click on it</span>
    chromedp.Click(selector),
}
</code></pre>
<p>So we have reached the page we wanted; the only remaining step is getting the page URL. For this purpose, we define a string variable and pass a pointer to it to chromedp's <code>Location</code> function.</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> u <span class="hljs-keyword">string</span>
...
chromedp.Location(&amp;u),
</code></pre>
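<p>Why a pointer? The task list is built before anything runs, so the URL does not exist yet at that moment; the action needs a place to write its result into later. The sketch below shows the idea without chromedp. The <code>action</code> type and the <code>location</code> and <code>run</code> functions are hypothetical stand-ins that I made up for illustration; they are not chromedp's API.</p>

```go
package main

import "fmt"

// action is a hypothetical stand-in for a browser task: a step that runs later.
type action func() error

// location returns an action that, when run, writes the current URL into *dst.
// currentURL is a placeholder for "ask the browser where we are".
func location(currentURL func() string, dst *string) action {
	return func() error {
		if dst == nil {
			return fmt.Errorf("dst must not be nil")
		}
		*dst = currentURL()
		return nil
	}
}

// run executes the actions in order and stops at the first error.
func run(actions ...action) error {
	for _, a := range actions {
		if err := a(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var u string // still empty while the action list is being built
	page := func() string { return "https://example.com/some/page" }

	if err := run(location(page, &u)); err != nil {
		panic(err)
	}
	// u was filled in through the pointer while the action ran.
	fmt.Printf("URL => %s\n", u)
}
```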
<p>and here's the complete code:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"fmt"</span>

    <span class="hljs-string">"github.com/chromedp/chromedp"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">var</span> u <span class="hljs-keyword">string</span>
    opts := <span class="hljs-built_in">append</span>(
        <span class="hljs-comment">// select all the elements after the third element</span>
        chromedp.DefaultExecAllocatorOptions[<span class="hljs-number">3</span>:],
        chromedp.NoFirstRun,
        chromedp.NoDefaultBrowserCheck,
    )
    <span class="hljs-comment">// create chromedp's context</span>
    parentCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
    <span class="hljs-keyword">defer</span> cancel()

    selector := <span class="hljs-string">"a[title='util_test.go']"</span>
    tasks := chromedp.Tasks{
        chromedp.Navigate(<span class="hljs-string">"https://github.com/chromedp/chromedp"</span>),
        chromedp.WaitEnabled(selector),
        chromedp.Click(selector),
        chromedp.Location(&amp;u),
    }
    ctx, cancel := chromedp.NewContext(parentCtx)
    <span class="hljs-keyword">defer</span> cancel()
    <span class="hljs-keyword">if</span> err := chromedp.Run(ctx, tasks); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }
    fmt.Printf(<span class="hljs-string">"URL =&gt; %s\n"</span>, u)
}
</code></pre>
<p>the output is:</p>
<pre><code class="lang-bash">URL =&gt; https://github.com/chromedp/chromedp/blob/master/util_test.go
</code></pre>
]]></content:encoded></item><item><title><![CDATA[chromedp: Working with nodes and tabs]]></title><description><![CDATA[Today we are going to have some fun with chromedp. Today's scenario is gathering all the links on a page and loop through them. It should be simple, easy, and fun. let's get started.
As we discussed before, for doing this kind of actions in chromedp,...]]></description><link>https://devmarkpro.com/chromedp-working-with-nodes-and-tabs</link><guid isPermaLink="true">https://devmarkpro.com/chromedp-working-with-nodes-and-tabs</guid><category><![CDATA[Go Language]]></category><category><![CDATA[Google Chrome]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Sun, 03 Jan 2021 22:27:01 GMT</pubDate><content:encoded><![CDATA[<p>Today we are going to have some fun with <a target="_blank" href="https://github.com/chromedp/chromedp">chromedp</a>. Today's scenario is gathering all the links on a page and looping through them. It should be simple, easy, and fun. Let's get started.</p>
<p>As we <a target="_blank" href="https://www.devmarkpro.com/chromedp-get-current-page-url/">discussed before</a>, for this kind of action in chromedp we need to pass a pointer to our variable to the corresponding function.</p>
<p>We want to go to the <a target="_blank" href="https://notepad-plus-plus.org/downloads/">Notepad++</a> website, print all the links, and open each of them in a separate tab. So first we need to declare a slice of node pointers.</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> nodes []*cdp.Node
</code></pre>
<p>Then we fill it by passing its address to the <code>Nodes</code> function.</p>
<pre><code class="lang-go">selector := <span class="hljs-string">"#main ul li a"</span>
pageURL := <span class="hljs-string">"https://notepad-plus-plus.org/downloads/"</span>
chromedp.Run(ctx, chromedp.Tasks{
    chromedp.Navigate(pageURL),
    chromedp.WaitReady(selector),
    chromedp.Nodes(selector, &amp;nodes),
})
</code></pre>
<p>So let's take a look at the nodes and see what we have in them.</p>
<pre><code class="lang-go"><span class="hljs-keyword">for</span> _, n := <span class="hljs-keyword">range</span> nodes {
    u := n.AttributeValue(<span class="hljs-string">"href"</span>)
    fmt.Printf(<span class="hljs-string">"node: %s | href = %s\n"</span>, n.LocalName, u)
}
</code></pre>
<p>the output is:</p>
<pre><code class="lang-bash">node: a | href = https://notepad-plus-plus.org/downloads/v7.9.2/
node: a | href = https://notepad-plus-plus.org/downloads/v7.9.1/
node: a | href = https://notepad-plus-plus.org/downloads/v7.9/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.9/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.8/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.7/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.6/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.5/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.4/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.3/
node: a | href = https://notepad-plus-plus.org/downloads/v7.8.2/
...
</code></pre>
<p>So we have grabbed all the links, and they are already stored in our <code>nodes</code> slice. Now let's open each of them in a new tab. To open a new tab in chromedp, we need to create a new context from the context that we already have. The <code>NewContext</code> function does that for us.</p>
<pre><code class="lang-go">clone, cancel := chromedp.NewContext(ctx)
<span class="hljs-keyword">defer</span> cancel()
</code></pre>
<p>From now on, the clone context will do everything in a new tab. The rest of the process is exactly the same, so we can easily run chromedp tasks in our new context.</p>
<pre><code class="lang-go"><span class="hljs-keyword">for</span> _, n := <span class="hljs-keyword">range</span> nodes {
    u := n.AttributeValue(<span class="hljs-string">"href"</span>)
    clone, cancel := chromedp.NewContext(ctx)
    <span class="hljs-keyword">defer</span> cancel()
    chromedp.Run(clone, chromedp.Navigate(u))
}
</code></pre>
<p>If you <a target="_blank" href="https://devmarkpro.com/posts/chromedp-get-started/">disable headless mode</a> and run the project, you will see the URLs opening one by one, each in a new tab. However, we can open all of them in a matter of seconds by using goroutines:</p>
<pre><code class="lang-go">f := <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(ctx context.Context, url <span class="hljs-keyword">string</span>)</span></span> {
    clone, cancel := chromedp.NewContext(ctx)
    <span class="hljs-keyword">defer</span> cancel()
    chromedp.Run(clone, chromedp.Navigate(url))
}
<span class="hljs-keyword">for</span> _, n := <span class="hljs-keyword">range</span> nodes {
    u := n.AttributeValue(<span class="hljs-string">"href"</span>)
    <span class="hljs-keyword">go</span> f(ctx, u)
}
</code></pre>
<p>that's it, and here's the complete code:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"log"</span>

    <span class="hljs-string">"github.com/chromedp/cdproto/cdp"</span>
    <span class="hljs-string">"github.com/chromedp/chromedp"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    ctx, cancel := chromedp.NewContext(context.Background(), chromedp.WithErrorf(log.Printf))
    <span class="hljs-keyword">defer</span> cancel()
    <span class="hljs-keyword">var</span> nodes []*cdp.Node
    selector := <span class="hljs-string">"#main ul li a"</span>
    pageURL := <span class="hljs-string">"https://notepad-plus-plus.org/downloads/"</span>
    <span class="hljs-keyword">if</span> err := chromedp.Run(ctx, chromedp.Tasks{
        chromedp.Navigate(pageURL),
        chromedp.WaitReady(selector),
        chromedp.Nodes(selector, &amp;nodes),
    }); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }
    f := <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(ctx context.Context, url <span class="hljs-keyword">string</span>)</span></span> {
        clone, cancel := chromedp.NewContext(ctx)
        <span class="hljs-keyword">defer</span> cancel()
        fmt.Printf(<span class="hljs-string">"%s is opening in a new tab\n"</span>, url)

        <span class="hljs-keyword">if</span> err := chromedp.Run(clone, chromedp.Navigate(url)); err != <span class="hljs-literal">nil</span> {
            <span class="hljs-comment">// do something nice with your errors!</span>
            <span class="hljs-built_in">panic</span>(err)
        }
    }
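    <span class="hljs-comment">// main must wait for the goroutines, otherwise the program exits</span>
    <span class="hljs-comment">// before any tab has had a chance to load</span>
    done := <span class="hljs-built_in">make</span>(<span class="hljs-keyword">chan</span> <span class="hljs-keyword">struct</span>{})
    <span class="hljs-keyword">for</span> _, n := <span class="hljs-keyword">range</span> nodes {
        u := n.AttributeValue(<span class="hljs-string">"href"</span>)
        <span class="hljs-keyword">go</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span></span> {
            f(ctx, u)
            done &lt;- <span class="hljs-keyword">struct</span>{}{}
        }()
    }
    <span class="hljs-comment">// receive one signal per started goroutine</span>
    <span class="hljs-keyword">for</span> <span class="hljs-keyword">range</span> nodes {
        &lt;-done
    }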
}
</code></pre>
]]></content:encoded></item><item><title><![CDATA[chromedp: Get started to surfing the web with Go]]></title><description><![CDATA[chromedp helps you to flip through! the web fast, easy, and programmatically in Golang. If you have ever worked with Selenium or PhantomJS or other similar tools, this concept is familiar to you. 
In this article we are going to get start working wit...]]></description><link>https://devmarkpro.com/chromedp-get-started</link><guid isPermaLink="true">https://devmarkpro.com/chromedp-get-started</guid><category><![CDATA[Go Language]]></category><category><![CDATA[Google Chrome]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Sat, 02 Jan 2021 18:30:12 GMT</pubDate><content:encoded><![CDATA[<p><a target="_blank" href="https://github.com/chromedp/chromedp">chromedp</a> helps you flip through the web <strong>fast</strong>, <em>easily</em>, and <strong><em>programmatically</em></strong> in Go. If you have ever worked with <a target="_blank" href="https://www.selenium.dev/">Selenium</a>, <a target="_blank" href="https://phantomjs.org/">PhantomJS</a>, or other similar tools, this concept will be familiar to you.</p>
<p>In this article we are going to get started with <a target="_blank" href="https://github.com/chromedp/chromedp">chromedp</a> and do some simple tasks with it. Let's get started.</p>
<p>First things first: you have to add chromedp to your project's dependencies. It can be done just like with any other Go package that you have used before:</p>
<pre><code class="lang-bash">go get -u github.com/chromedp/chromedp
</code></pre>
<p>To start working with chromedp, you do not need to do a lot! The only thing you need to do is create a <code>context</code> and start working with it:</p>
<pre><code class="lang-go">ctx, cancel := chromedp.NewContext(context.Background())
<span class="hljs-keyword">defer</span> cancel()
chromedp.Run(ctx, chromedp.Navigate(<span class="hljs-string">"https://www.google.com"</span>))
</code></pre>
<p>That's it. Trust me, chromedp just opened Google, even if you didn't see anything! The reason you didn't see the browser is that by default it runs in headless mode, and, as you might expect, we can change that behavior. Actually, there is a list of default options that the library uses:</p>
<pre><code class="lang-go"><span class="hljs-comment">// DefaultExecAllocatorOptions are the ExecAllocator options used by NewContext</span>
<span class="hljs-comment">// if the given parent context doesn't have an allocator set up. Do not modify</span>
<span class="hljs-comment">// this global; instead, use NewExecAllocator. See ExampleExecAllocator.</span>
<span class="hljs-keyword">var</span> DefaultExecAllocatorOptions = [...]ExecAllocatorOption{
    NoFirstRun,
    NoDefaultBrowserCheck,
    Headless,

    <span class="hljs-comment">// After Puppeteer's default behavior.</span>
    Flag(<span class="hljs-string">"disable-background-networking"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"enable-features"</span>, <span class="hljs-string">"NetworkService,NetworkServiceInProcess"</span>),
    Flag(<span class="hljs-string">"disable-background-timer-throttling"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-backgrounding-occluded-windows"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-breakpad"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-client-side-phishing-detection"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-default-apps"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-dev-shm-usage"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-extensions"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-features"</span>, <span class="hljs-string">"site-per-process,TranslateUI,BlinkGenPropertyTrees"</span>),
    Flag(<span class="hljs-string">"disable-hang-monitor"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-ipc-flooding-protection"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-popup-blocking"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-prompt-on-repost"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-renderer-backgrounding"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"disable-sync"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"force-color-profile"</span>, <span class="hljs-string">"srgb"</span>),
    Flag(<span class="hljs-string">"metrics-recording-only"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"safebrowsing-disable-auto-update"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"enable-automation"</span>, <span class="hljs-literal">true</span>),
    Flag(<span class="hljs-string">"password-store"</span>, <span class="hljs-string">"basic"</span>),
    Flag(<span class="hljs-string">"use-mock-keychain"</span>, <span class="hljs-literal">true</span>),
}
</code></pre>
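<p>As you can see, the third item in the array is <code>Headless</code>. The comment tells us not to modify this global, so instead we build our own option list: we skip the first three items (which drops <code>Headless</code>) and add <code>NoFirstRun</code> and <code>NoDefaultBrowserCheck</code> back.</p>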
<pre><code class="lang-go">opts := <span class="hljs-built_in">append</span>(
    <span class="hljs-comment">// select all the elements after the third element</span>
    chromedp.DefaultExecAllocatorOptions[<span class="hljs-number">3</span>:],
    chromedp.NoFirstRun,
    chromedp.NoDefaultBrowserCheck,
)
</code></pre>
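<p>If you wonder whether that <code>append</code> call could accidentally change the global array, the following sketch may help. It uses a small array of strings as a hypothetical stand-in for <code>chromedp.DefaultExecAllocatorOptions</code>; the names are made up for illustration. Slicing an array up to its end gives a slice with no spare capacity, so <code>append</code> has to copy into fresh memory and the original stays untouched.</p>

```go
package main

import "fmt"

// defaults is a hypothetical stand-in for chromedp.DefaultExecAllocatorOptions.
var defaults = [...]string{
	"no-first-run",
	"no-default-browser-check",
	"headless",
	"disable-sync",
	"enable-automation",
}

// withoutHeadless mirrors the append call from the article: skip the first
// three items, then add the two non-headless ones back.
func withoutHeadless() []string {
	// defaults[3:] has len == cap, so append allocates a new backing array
	// instead of writing into defaults.
	return append(defaults[3:], "no-first-run", "no-default-browser-check")
}

func main() {
	opts := withoutHeadless()
	fmt.Println("custom options:", opts)
	fmt.Println("global array:  ", defaults)
}
```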
<p>The last step is to create an ExecAllocator and pass it as our parent context.</p>
<pre><code class="lang-go">parentCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
<span class="hljs-keyword">defer</span> cancel()
ctx, cancel := chromedp.NewContext(parentCtx)
<span class="hljs-keyword">defer</span> cancel()
</code></pre>
<p>Now if you run the project, you will see the browser open and navigate to the Google website.
Here's the complete code:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"fmt"</span>

    <span class="hljs-string">"github.com/chromedp/chromedp"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    opts := <span class="hljs-built_in">append</span>(
        <span class="hljs-comment">// select all the elements after the third element</span>
        chromedp.DefaultExecAllocatorOptions[<span class="hljs-number">3</span>:],
        chromedp.NoFirstRun,
        chromedp.NoDefaultBrowserCheck,
    )
    <span class="hljs-comment">// create chromedp's context</span>
    parentCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
    <span class="hljs-keyword">defer</span> cancel()

    ctx, cancel := chromedp.NewContext(parentCtx)
    <span class="hljs-keyword">defer</span> cancel()

    <span class="hljs-keyword">if</span> err := chromedp.Run(ctx, chromedp.Navigate(<span class="hljs-string">"https://www.google.com"</span>)); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }

    fmt.Println(<span class="hljs-string">"I've just seen Google!!!"</span>)
}
</code></pre>
]]></content:encoded></item><item><title><![CDATA[<bits/stdc++.h> in MacOS]]></title><description><![CDATA[The stdc++.h file is just a file with a bunch of include. the only thing that you have to do is to put this file in the default location of the c++ import files. Here's a quick way to add it in MacOS.
With XCode installed
create a file named stdc++.h...]]></description><link>https://devmarkpro.com/fix-stdcpp-h-in-macos</link><guid isPermaLink="true">https://devmarkpro.com/fix-stdcpp-h-in-macos</guid><category><![CDATA[C++]]></category><dc:creator><![CDATA[Mark Karamyar]]></dc:creator><pubDate>Thu, 31 Dec 2020 23:00:00 GMT</pubDate><content:encoded><![CDATA[<p>The <code>stdc++.h</code> file is just a file with a bunch of <code>#include</code> directives. The only thing you have to do is put this file in one of the compiler's default include directories for C++ headers. Here's a quick way to add it on macOS.</p>
<h4 id="with-xcode-installed">With XCode installed</h4>
<p>Create a file named <code>stdc++.h</code> in this path (you need to create the <code>bits</code> folder first):</p>
<p><code>/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/bits</code></p>
<h4 id="without-xcode">Without XCode</h4>
<p>Create a file named <code>stdc++.h</code> in this path (you need to create the <code>bits</code> folder first):</p>
<p><code>/usr/local/include/bits</code></p>
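<p>Now, fill the created file with the contents of the GNU <code>stdc++.h</code> header. Note that the listing below stops inside the <code>#if __cplusplus &gt;= 201103L</code> block, so it is only the beginning of the file:</p>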
<pre><code class="lang-cpp"><span class="hljs-comment">// C++ includes used for precompiling -*- C++ -*-</span>

<span class="hljs-comment">// Copyright (C) 2003-2013 Free Software Foundation, Inc.</span>
<span class="hljs-comment">//</span>
<span class="hljs-comment">// This file is part of the GNU ISO C++ Library.  This library is free</span>
<span class="hljs-comment">// software; you can redistribute it and/or modify it under the</span>
<span class="hljs-comment">// terms of the GNU General Public License as published by the</span>
<span class="hljs-comment">// Free Software Foundation; either version 3, or (at your option)</span>
<span class="hljs-comment">// any later version.</span>

<span class="hljs-comment">// This library is distributed in the hope that it will be useful,</span>
<span class="hljs-comment">// but WITHOUT ANY WARRANTY; without even the implied warranty of</span>
<span class="hljs-comment">// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the</span>
<span class="hljs-comment">// GNU General Public License for more details.</span>

<span class="hljs-comment">// Under Section 7 of GPL version 3, you are granted additional</span>
<span class="hljs-comment">// permissions described in the GCC Runtime Library Exception, version</span>
<span class="hljs-comment">// 3.1, as published by the Free Software Foundation.</span>

<span class="hljs-comment">// You should have received a copy of the GNU General Public License and</span>
<span class="hljs-comment">// a copy of the GCC Runtime Library Exception along with this program;</span>
<span class="hljs-comment">// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see</span>
<span class="hljs-comment">// &lt;http://www.gnu.org/licenses/&gt;.</span>

<span class="hljs-comment">/** @file stdc++.h
 *  This is an implementation file for a precompiled header.
 */</span>

<span class="hljs-comment">// 17.4.1.2 Headers</span>

<span class="hljs-comment">// C</span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">ifndef</span> _GLIBCXX_NO_ASSERT</span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cassert&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">endif</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cctype&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cerrno&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cfloat&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;ciso646&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;climits&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;clocale&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cmath&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;csetjmp&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;csignal&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cstdarg&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cstddef&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cstdio&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cstdlib&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cstring&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;ctime&gt;</span></span>

<span class="hljs-meta">#<span class="hljs-meta-keyword">if</span> __cplusplus &gt;= 201103L</span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;ccomplex&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cfenv&gt;</span></span>
</code></pre>
<p>Or you can download the file and put it in the right folder:</p>
<pre><code class="lang-bash">mkdir -p /usr/<span class="hljs-built_in">local</span>/include/bits
curl https://gist.githubusercontent.com/devmarkpro/94d0b20a87bd0cc5f094a9b98336aa6d/raw/724c53e3210029f7bfd5dfa6254d124791547153/stdc++.h &gt; /usr/<span class="hljs-built_in">local</span>/include/bits/stdc++.h
</code></pre>
<p>NOTE: You have to repeat these steps each time you update the <code>Xcode Command Line Tools</code>. If you know a permanent solution to this problem, please share it in the comment section.</p>
]]></content:encoded></item></channel></rss>