The Billion-Dollar Heist: How Bybit Survived the Largest Crypto Hack in History

On February 21, the cryptocurrency world was shaken when Bybit, one of the largest cryptocurrency exchanges, fell victim to a staggering $1.5 billion hack – the biggest cyber heist in crypto history. Despite the massive breach, the platform continued operating, thanks in part to swift crisis management and the backing of industry heavyweights.

How the Hack Unfolded

On February 21, on-chain detective ZachXBT reported suspicious ETH outflows from Bybit totaling 499,395 ETH (about $1.46 billion at the time). CEO Ben Zhou confirmed the hack, and the company almost immediately published a statement saying the incident occurred during a transfer of ETH from cold multisig storage to a hot wallet.

The attackers tampered with the transaction signing interface so that everyone signing the transfer saw the correct address, while the underlying smart contract logic was quietly altered. As a result, the hackers gained control of the ETH wallet and withdrew all of its funds.

Zhou moved quickly to reassure clients, stressing that the platform remained solvent and continued to process withdrawal requests, albeit with delays: within roughly 10 hours of the hack, the exchange recorded a record number of withdrawal requests – more than 350,000. At that point, about 2,100 requests were still pending, while 99.994% of transactions had been completed.

Nevertheless, the CEO still asked partners for loans in ETH to cover liquidity during the crisis period. As a result, more than 10 companies stepped in to support the exchange.

Huobi co-founder Du Jun contributed 10,000 ETH and promised not to withdraw it for a month. The co-founders of Conflux and Mask Network also announced Ether deposits to the exchange’s cold wallets. Coinbase Head of Product Conor Grogan wrote that Binance and Bitget also sent more than 50,000 ETH to Bybit.

According to reporter Colin Wu, 12,652 stETH (around $33.75 million) were transferred from MEXC to Bybit’s cold wallet.

The ETH price responded to the Bybit hack by falling to $2,625 (on Binance), but recovered fairly quickly. By the evening of February 23, the price briefly exceeded $2,850 before correcting to $2,690 (as of February 24).

Bybit representatives said that information about the incident has been “reported to the relevant authorities.” In addition, cooperation with on-chain analytics providers has allowed them to identify and isolate the associated addresses, limiting the attackers’ ability to “withdraw ETH through legitimate markets.”

As of February 24, Bybit has fully restored its Ethereum reserves (~444,870 ETH).

Who Was Behind the Attack?

According to ZachXBT, the unknown attackers quickly exchanged some of the stolen mETH and stETH tokens for ETH via decentralized exchanges, and 10,000 ETH were divided among 36 wallets.

The founder of DeFi Llama, 0xngmi, noted that the methods used in this attack resemble the July 2024 incident at the Indian exchange WazirX. At the time, Elliptic analysts concluded that North Korean hackers were behind that attack.

0xngmi’s assumption was confirmed by Arkham Intelligence. According to the firm, on the day of the Bybit hack investigator ZachXBT “provided irrefutable evidence of Lazarus Group’s involvement in the hack”:

“His analysis contains a detailed breakdown of test transactions and associated wallets used before the attack, as well as a number of graphs and timestamps. This data has been shared with the exchange’s team to assist with the investigation.”

The founder of the AML service BitOK, crypto investor Dmitry Machikhin, noted that the stolen cryptocurrency is actively being moved from the Ethereum network to other blockchains. According to his observations, immediately after the hack the assets were distributed across 48 different addresses.

At the second stage:

  • crypto assets from these addresses were gradually split into even smaller parts (50 ETH each);
  • funds were sent through bridges (eXch and Chainflip) to other networks.

An accompanying image shows one of the 48 addresses splitting transactions into 50 ETH chunks and routing them to Chainflip.

According to Taproot Wizards co-founder Eric Wall, the North Korean hackers will likely convert all ERC-20 tokens to ETH, exchange the resulting ETH for BTC, and then gradually move the bitcoins into yuan through Asian exchanges. In his opinion, the process could take years.

ZachXBT reported that Lazarus transferred 5,000 ETH to a new address and began laundering the funds through the centralized mixer eXch, then converted them to bitcoin via Chainflip. Chainflip, in turn, said it had detected the attackers’ attempts to cash out the stolen Bybit funds in bitcoin through its platform. The team disabled some front-end services, but stopping the protocol entirely is impossible, given its decentralized structure of 150 nodes.

The mETH Protocol team reported that they blocked the withdrawal of 15,000 cmETH (~$43.5 million) and redirected the assets from the attacker’s address to a recovery account. Tether CEO Paolo Ardoino said that the company froze 181,000 USDT related to the attack.

In a comment to ForkLog, Bitget CEO Gracie Chen emphasized that “the exchange’s systems have already blacklisted the attackers’ wallets.”

As of February 23, the attackers had exchanged 37,900 ETH (about $106 million) for bitcoin and other assets through Chainflip, THORChain, LiFi, DLN, and eXch. The hackers’ address still had 461,491 ETH of the 499,395 ETH stolen.

What to do?

After the hack, some community members started talking about rolling back the state of the Ethereum network to return the stolen funds. For example, former BitMEX CEO Arthur Hayes said that, as an investor with large ETH holdings, he would support a community decision to roll the chain back to an earlier state – as was done after The DAO hack in 2016.

Bitcoin maximalist Samson Mow also spoke out in support of restoring the blockchain, but leading Ethereum developer Tim Beiko criticized the idea. According to him, the Bybit incident stemmed from transaction data being misrepresented in a compromised interface, not from a technical flaw in Ethereum itself.

In addition, after the hack the funds quickly spread across the sprawling ecosystem of the second-largest cryptocurrency by capitalization. “Rolling back” the network would mean canceling many legitimate transactions, some of them tied to activity outside Ethereum. Yuga Labs Vice President Quit made the same point, adding that many ordinary users would lose money and that the accounting systems of large players like Circle and Tether would break down.

What’s the bottom line

The Bybit hack turned out to be the largest in the crypto industry so far. However, the head of Bitget did not find any reason to panic: according to her, the losses are equivalent to Bybit’s annual profit ($1.5 billion), and clients’ funds are completely safe.

The incident did not dent market sentiment either. According to Glassnode, Bitcoin’s implied volatility is close to record lows. Price swings triggered by the attack subsided after Strategy founder Michael Saylor published a chart of the company’s bitcoin purchases.

This time, there was no platform crash or market panic, and a quick response and community participation helped restore liquidity and partially block the stolen assets. However, the incident highlighted a persistent problem – even large centralized platforms are still susceptible to attacks and vulnerable to hackers.

Grok Names Elon Musk as the Main Disinformer

Elon Musk is the main disseminator of disinformation on X, according to Grok, the AI assistant from the entrepreneur’s startup xAI that is integrated into his social network.

The billionaire has a huge audience and often spreads false information on various topics, the chatbot claims. Other disinformers named by the model include Donald Trump, Robert F. Kennedy Jr., Alex Jones, and RT (Russian television).

Trump shares false claims about elections, Kennedy Jr. about vaccines, and Alex Jones is known for spreading conspiracy theories, while Russian television lies about political issues, Grok added.

Grok’s Top Disseminators of Disinformation. Data: X.

The chatbot cited Rolling Stone, The Guardian, NPR, and NewsGuard as sources of information.

“The selection process involved analyzing multiple sources, including academic research, fact-checking organizations, and media reports, to identify those with significant influence and a history of spreading false or misleading information,” the AI noted.

The criteria for compiling the rankings included the volume of false information spread, the number of followers, and mentions in credible reports.

When asked for clarification, Grok noted that the findings may be biased, since the sources it relied on are largely tied to the funding or views of Democrats and liberals.

Recall that in January, artificial intelligence was used to spread fake news about the fires in Southern California.

A similar situation arose after Hurricane Helene.

Google Unveils Memory Feature for Gemini AI Chatbot

Google has launched a notable update to its Gemini AI chatbot, equipping it with the ability to remember details from previous conversations, a development experts are calling a major advancement.

In a blog post released on Thursday, Google detailed how this new capability allows Gemini to store information from earlier chats, provide summaries of past discussions, and craft responses tailored to what it has learned over time.

This upgrade eliminates the need for users to restate information they’ve already provided or sift through old messages to retrieve details. By drawing on prior interactions, Gemini can now deliver answers that are more relevant, cohesive, and enriched with additional context pulled from its memory. This results in smoother, more personalized exchanges that feel less fragmented and more like a continuous dialogue.

Rollout Plans and Broader Access
The memory feature is first being introduced to English-speaking users subscribed to Google One AI Premium, a $20 monthly plan offering enhanced AI tools. Google plans to extend this functionality to more languages in the near future and will soon bring it to business users via Google Workspace Business and Enterprise plans.

Tackling Privacy and User Control
While the ability to recall conversations offers convenience, it may raise eyebrows among those concerned about data privacy. To address this, Google has built in several options for users to oversee their chat data. Through the “My Activity” section in Gemini, individuals can view their stored conversations, remove specific entries, or decide how long data is kept. For those who prefer not to use the feature at all, it can be fully turned off, giving users complete authority over what the AI retains.

Google has also made it clear that it won’t use these stored chats to refine its AI models, putting to rest worries about data being repurposed.

The Race to Enhance AI Memory

Google isn’t alone in its efforts to boost chatbot memory. OpenAI’s Sam Altman has highlighted that better recall is a top demand from ChatGPT users. Over the last year, both companies have rolled out features letting their AIs remember things like a user’s favorite travel options, food preferences, or even their preferred tone of address. Until now, though, these memory tools have been fairly limited and didn’t automatically preserve entire conversation histories.

Gemini’s new recall ability marks a leap toward more fluid and insightful AI exchanges. By keeping track of past talks, it lets users pick up where they left off without losing the thread, proving especially handy for long-term tasks or recurring questions.

As this feature spreads to more users, Google underscores its commitment to transparency and control, ensuring people can easily manage, erase, or opt out of data retention altogether.

Sam Altman Talks About the Features of GPT-4.5 and GPT-5

OpenAI CEO Sam Altman shared the startup’s plans to release GPT-4.5 and GPT-5 models. The company aims to simplify its product offerings by making them more intuitive for users.

Altman acknowledged that the current product line has become too complex, and OpenAI is looking to change that.

“We hate model selection as much as you do and want to get back to magical unified intelligence,” he wrote.

GPT-4.5, codenamed Orion, will be the startup’s last AI model without a “chain of reasoning” mechanism. The next step is to move toward more integrated solutions.

The company plans to combine the o and GPT series models, creating systems capable of:

  • using all available tools;
  • independently determining when deep thinking is needed and when an instant solution is enough;
  • adapting to a wide range of tasks.

GPT-5 will integrate various technologies, including o3. Other features will include Canvas mode, search, Deep Research, and more.

Users on the free tier will get unlimited access to GPT-5 at the standard intelligence setting, while Plus and Pro subscribers will be able to run the model at higher levels of intelligence.

Regarding release dates, Altman wrote in replies to the tweet that GPT-4.5 is “weeks” away and GPT-5 “months” away.

According to Elon Musk, the Grok 3 chatbot – a ChatGPT competitor – is in the final stages of development and will be released within one to two weeks, Reuters reports.

“Grok 3 has very powerful reasoning capabilities, so in the tests we’ve done so far, Grok 3 outperforms all the models that we know of, so that’s a good sign,” the entrepreneur said during a speech at the World Governments Summit in Dubai.

Recall that Altman turned down a $97.4 billion bid from Musk and a group of investors to buy the non-profit that controls OpenAI. The startup’s CEO said the offer was an attempt to “slow down” a competing project.

Unleashing Powerful Analytics: Harnessing Cassandra with Spark

Authored by Abhinav Jain, Senior Software Engineer

The adoption of Apache Cassandra and Apache Spark is a game-changer for organizations seeking to transform their analytics capabilities in a data-driven world. With its decentralized architecture, Apache Cassandra handles huge volumes of data across multiple data centers with minimal downtime, offering fault tolerance and linear scalability – one reason more than 1,500 companies, including Netflix and Apple, deploy it. Apache Spark complements this by processing data in memory, at speeds up to 100 times faster than disk-based systems, greatly enhancing what Cassandra alone can deliver.

Combining Cassandra and Spark yields not just a speedup but better analytics overall. Organizations that use the pairing report cutting data processing time from hours to minutes – vital for surfacing insights quickly and staying ahead in competitive markets. The two technologies complement each other well: used jointly, Spark and Cassandra are particularly suited to real-time trend analysis.

On top of that, integrating the two technologies answers the growing demand for flexible, scalable solutions in fields as demanding as finance, where integrity, validity, and speed all matter. Working together, they let organizations not only manage larger datasets more efficiently but also extract actionable intelligence to guide operational and strategic decisions. Given this, an understanding of Cassandra’s integration with Spark belongs in any organization that intends to improve its operational analytics.

Preface: Combining Cassandra’s Distribution with Spark’s In-Memory Processing

Apache Cassandra has long been a common choice for organizations that need distributed storage and handling of large data volumes. Its decentralized architecture and tunable consistency levels, together with its ability to spread large amounts of data across many nodes with minimal delay, make it a natural fit. Apache Spark, in turn, excels at processing and analyzing data in memory, which makes it an outstanding partner to Cassandra for both real-time analytics and batch processing.

Setting Up the Environment

To prepare the environment for analytics with Cassandra and Spark, start by installing Apache Cassandra and then launching a Spark cluster. Both components need individual attention during configuration so they work in harmony and each delivers its best performance. Including a connector such as the DataStax Spark Cassandra Connector (the open-source Spark Cassandra Connector) is pivotal, since it handles data flow between the Spark and Cassandra systems. The connector speeds up query operations by giving Spark direct, parallelized access to Cassandra data with little network overhead.
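As a rough illustration, the sketch below configures a PySpark session with the open-source Spark Cassandra Connector and reads a Cassandra table into a DataFrame. The connector coordinates, host, keyspace, and table names are placeholder assumptions to adapt to your own versions and environment.

```python
from pyspark.sql import SparkSession

# The connector coordinates below are illustrative; pick the artifact that
# matches your Spark and Scala versions.
spark = (
    SparkSession.builder
    .appName("cassandra-analytics")
    .config("spark.jars.packages",
            "com.datastax.spark:spark-cassandra-connector_2.12:3.4.1")
    .config("spark.cassandra.connection.host", "127.0.0.1")
    .getOrCreate()
)

# Read a Cassandra table as a DataFrame; the connector pushes filters down
# to Cassandra where possible, keeping network overhead low.
transactions = (
    spark.read
    .format("org.apache.spark.sql.cassandra")
    .options(keyspace="finance", table="transactions")
    .load()
)

transactions.filter("amount > 10000").show()
```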

With the connectors configured, it is equally important to tune the settings to match your workload and data volume. This may mean adjusting Cassandra’s compaction strategies and Spark’s memory management settings in anticipation of the incoming data load. The final step is verifying the setup with test data: a successful run confirms the integration works and that analytics can proceed as expected. This setup acts as a foundation for both technologies, allowing each to run at full capacity within one coherent analytics environment.

Performing Analytics with Spark and Cassandra

Pairing Spark with Cassandra enhances data processing by combining Spark’s distributed, in-memory computation with Cassandra’s resilient storage layer. End users can run advanced queries over large datasets directly against data stored in Cassandra. These capabilities are extended by the libraries bundled with Spark – MLlib for machine learning, GraphX for graph processing, and Spark SQL for structured data – which make complex transformations, predictive analytics, and aggregation tasks straightforward. By caching data in memory, Spark also accelerates iterative algorithms and repeated queries, making it ideal for workloads with frequent data access. The integration streamlines workflows and maintains high performance even as deployments scale to meet growing big data demands.
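Building on the session above, the hedged example below shows the kind of ad-hoc Spark SQL analysis this enables over Cassandra data, with in-memory caching for repeated queries; the table and column names are hypothetical.

```python
# Assumes the `spark` session and `transactions` DataFrame from the setup above.
transactions.createOrReplaceTempView("transactions")

# Cache the working set in memory to speed up repeated, iterative queries.
spark.table("transactions").cache()

daily_volume = spark.sql("""
    SELECT account_id,
           date_trunc('day', created_at) AS day,
           SUM(amount) AS total_amount,
           COUNT(*)    AS tx_count
    FROM transactions
    GROUP BY account_id, date_trunc('day', created_at)
""")

daily_volume.orderBy("day").show(20, truncate=False)
```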

Real-time Analytics and Stream Processing

Real-time analytics with Spark and Cassandra gives organizations a solid approach to ingesting and immediately analyzing data streams. This is especially valuable where speed and freshness matter – monitoring financial transactions, social network activity, or IoT output, for example. With Spark Streaming, data can be ingested in micro-batches and processed continuously, with complex algorithms applied on the fly. Combined with Cassandra’s change data capture (CDC) feature or tightly integrated with Apache Kafka as a message queue, Spark becomes a powerful tool for building feedback-driven analytical solutions that support dynamic decisions and adapt to changes detected in incoming data streams.
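A minimal Structured Streaming sketch of that pattern is shown below: events are read from an assumed Kafka topic, parsed, and appended to a Cassandra table through the connector via foreachBatch. The topic, schema, keyspace, and table names are illustrative, and the `spark` session from the earlier setup is assumed.

```python
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

# Schema of the JSON payments flowing through the (hypothetical) Kafka topic.
schema = (StructType()
          .add("payment_id", StringType())
          .add("account_id", StringType())
          .add("amount", DoubleType())
          .add("created_at", TimestampType()))

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "payments")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

def write_to_cassandra(batch_df, batch_id):
    # Each micro-batch is appended to Cassandra through the connector.
    (batch_df.write
     .format("org.apache.spark.sql.cassandra")
     .options(keyspace="finance", table="payments")
     .mode("append")
     .save())

query = (events.writeStream
         .foreachBatch(write_to_cassandra)
         .outputMode("append")
         .start())
query.awaitTermination()
```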

Machine Learning and Advanced Analytics

Beyond traditional analytics tasks, Spark opens the door to advanced analytics and machine learning on Cassandra data. Using Spark’s MLlib and ML packages, users can build and train machine learning models directly on data stored in Cassandra, without moving or duplicating it, enabling predictive analytics, anomaly detection, and other high-end use cases.
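For instance, a hedged MLlib sketch might train a simple fraud classifier directly on a Cassandra table; the keyspace, table, feature columns, and label column below are assumptions for illustration only.

```python
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

# Read labeled historical transactions straight from Cassandra.
df = (spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="finance", table="labeled_transactions")
      .load())

# Assemble numeric columns into a single feature vector for MLlib.
assembler = VectorAssembler(
    inputCols=["amount", "merchant_risk_score", "tx_per_hour"],
    outputCol="features")

train, test = assembler.transform(df).randomSplit([0.8, 0.2], seed=42)

model = LogisticRegression(labelCol="is_fraud", featuresCol="features").fit(train)
print("Test AUC:", model.evaluate(test).areaUnderROC)
```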

Best Practices and Considerations

To get the most out of a Spark and Cassandra integration, follow established best practices. Shape Cassandra’s data model around the expected query patterns to reduce read and write latencies. Design partition keys so data is distributed evenly across nodes and hotspots are avoided, and configure Spark’s memory and core settings appropriately; this helps you avoid resource overcommitment and the performance issues that come with it.

Moreover, both the Spark and Cassandra clusters should be monitored continuously. Tools such as Spark’s web UI and Cassandra’s nodetool surface performance metrics and make bottlenecks visible quickly. Put strict data governance policies in place, including regular audits and compliance checks, to ensure data integrity and security. Secure access to data with authentication and encryption (both in transit and at rest) to prevent unauthorized access and breaches.

Conclusion

Combining Apache Cassandra and Apache Spark creates a powerful platform for large-scale analytics, helping organizations extract valuable, meaningful insights far more quickly than before. By playing to each technology’s strengths, companies can stay ahead of the competition, foster innovation, and base their decisions on quality data. Whether the task is historical data analysis, processing streaming data as it flows, or building machine learning pipelines, Cassandra and Spark together form an adaptable, expandable solution for a wide range of analytical needs.

The Application of AI to Real-Time Fraud Detection in Digital Payments

The growth of the internet and advanced digital communication systems has transformed the global economy, especially commerce. Fraud attempts, meanwhile, have become more diverse and sophisticated over time, costing businesses and financial institutions millions of dollars each year. Fraud detection, in turn, has evolved from unsophisticated manual processes to automated rule-based methods and, more recently, to intelligent systems. Today, artificial intelligence (AI) helps both control and combat fraud, advancing the financial technology (fintech) sector. In this article, we explain the mechanics of AI in digital payments fraud detection, focusing on technical aspects, a real-world case, and takeaways relevant to mid-level AI engineers, product managers, and other fintech professionals.

The Increased Importance of Identifying Fraud in Real Time

The volume and complexity of digital payments – credit card transactions, P2P app payments, A2A payments, and more – continue to rise. Juniper Research estimates that the cost of online payment fraud will climb past $362 billion globally between 2023 and 2028. Automated and social engineering attacks exploit weaknesses such as stolen credentials and synthetic identities, often striking within moments. Outdated fraud detection methods that depend on static rules (‘flag transactions over $10,000’) are ineffective against these fast-paced threats: review queues overflow, frustrated customers compound the problem, and undetected fraud sails through.

AI changes that picture. With machine learning, deep learning, and real-time data processing, AI can evaluate large volumes of data, recognize patterns, adapt to changes, and detect anomalies in a matter of milliseconds. For fintech professionals, this shift is both an opportunity and a challenge: build systems that are accurate, fast, and scalable while keeping customer friction low.

How AI-Fueled Real-Time Fraud Detection Works

AI-enhanced fraud detection rests on three tiers: data, algorithms, and real-time execution. Let’s break each down for a mid-level AI engineering or product management team.

The Underlying Information: A front-line fraud detection system needs each real-time payment transaction to be coupled with rich, high-quality data. That means diverse inputs: transaction histories, user behavior profiles, device fingerprints, IP geolocation, and external sources such as dark-web chatter. For instance, a transaction attempted from a new device in a foreign country becomes suspicious when set against the user’s baseline spending patterns. AI systems pull this data through streaming services such as Apache Kafka or cloud-native options like AWS Kinesis, which keep latency low. Data engineers must invest in clean, well-structured datasets, because the system performs poorly when data granularity is poor – a lesson I have learned repeatedly over the past twenty years.
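A minimal ingestion sketch along these lines, using the open-source kafka-python client, might look like the following; the topic name, message fields, and profile lookup are assumptions, not a production design.

```python
import json
from kafka import KafkaConsumer  # kafka-python client

consumer = KafkaConsumer(
    "transactions",                       # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

def build_features(tx, user_profile):
    """Combine the raw event with behavioural context for scoring."""
    return {
        "amount": tx["amount"],
        "is_new_device": tx["device_id"] not in user_profile["known_devices"],
        "is_foreign_ip": tx["ip_country"] != user_profile["home_country"],
        "amount_vs_avg": tx["amount"] / max(user_profile["avg_amount"], 1.0),
    }

for message in consumer:
    tx = message.value
    # In production the profile would come from a low-latency store such as Redis.
    profile = {"known_devices": set(), "home_country": "US", "avg_amount": 120.0}
    features = build_features(tx, profile)
    # ...pass `features` to the scoring model described in the next tier.
```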

Algorithms: Machine learning models are the backbone of AI fraud detection. Supervised models work with labeled datasets (e.g., “fraud” vs. “legitimate”) and are proficient at recognizing established fraud patterns; Random Forests and Gradient Boosting Machines (GBMs) are among the most popular thanks to their accuracy and interpretability. But fraud evolves faster than data can be labeled, and this is where unsupervised learning comes in. Clustering algorithms such as DBSCAN, or autoencoders, need no prior examples and can surface unusual transactions for review – a sudden spike in small, rapid transfers can be flagged as possible money laundering even without a historical fraud signature. Deep learning models such as recurrent neural networks (RNNs) improve detection further by finding hidden patterns and relationships in time-series data (e.g., transaction timestamps).
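To make the supervised side concrete, here is a hedged scikit-learn sketch of a gradient boosting classifier trained on labeled transactions; the synthetic data and feature meanings are purely illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 4))            # e.g. amount, velocity, device risk, geo risk
y = (X[:, 0] + X[:, 1] > 2.5).astype(int)   # stand-in labels: "fraud" vs "legitimate"

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Gradient boosting: one of the supervised models named above.
model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("precision:", precision_score(y_test, pred),
      "recall:", recall_score(y_test, pred))
```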

Execution in Real Time: Time is of the essence in digital payments. Payment systems must decide to approve, decline, or escalate a transaction in under 100 milliseconds. That is only achievable with distributed computing frameworks such as Apache Spark for batch processing and Apache Flink for real-time stream analysis. Inference is scaled on GPU-accelerated hardware (e.g., NVIDIA CUDA), comfortably handling thousands of transactions per second and more. Product managers should remember the latency trade-offs that come with model complexity: a simple logistic regression may be enough for low-risk scenarios, while high-precision cases justify heavier neural networks.

Real-World Case Study: PayPal’s AI-Driven Fraud Detection

To illustrate AI’s impact, consider PayPal, a fintech giant processing over 22 billion transactions annually. In the early 2010s, PayPal faced escalating payment fraud, including account takeovers and stolen card usage. Traditional rule-based systems flagged too many false positives, alienating users, while missing sophisticated attacks. By 2015, PayPal had fully embraced AI, integrating real-time ML models to combat fraud – a strategy we’ve seen replicated across the industry.

PayPal’s approach combines supervised and unsupervised learning. Supervised models analyze historical transaction data—device IDs, IP addresses, email patterns, and purchase amounts—to assign fraud probability scores. Unsupervised models detect anomalies, such as multiple login attempts from disparate locations or unusual order sizes (e.g., shipping dozens of items to one address with different cards). Real-time data feeds from user interactions and external sources (e.g., compromised credential lists) enhance these models’ accuracy.

Numbers: According to PayPal’s public reports and industry analyses, their AI system reduced fraud losses by 30% within two years of deployment, dropping fraud rates to below 0.32% of transaction volume—a benchmark in fintech. False positives fell by 25%, improving customer satisfaction, while chargeback rates declined by 15%. These gains stemmed from processing 80% of transactions in under 50 milliseconds, enabled by a hybrid cloud infrastructure and optimized ML pipelines. For AI engineers, PayPal’s use of ensemble models (combining decision trees and neural networks) offers a practical lesson in balancing precision and recall in high-stakes environments.

Technical Challenges and Solutions

Implementing AI for real-time fraud detection isn’t without hurdles. Here’s how to address them:

  • Data Privacy and Compliance: Regulations like GDPR and CCPA mandate strict data handling. Techniques like federated learning – training models locally on user devices – minimize exposure, while synthetic data generation (via GANs) augments training sets without compromising privacy.
  • Model Drift: Fraud patterns shift, degrading model performance. Continuous retraining with online learning algorithms (e.g., stochastic gradient descent) keeps models current, as in the sketch after this list. Monitoring metrics like precision, recall, and F1-score ensures drift is caught early.
  • Scalability: As transaction volumes grow, so must your system. Distributed architectures (e.g., Kubernetes clusters) and serverless computing (e.g., AWS Lambda) provide elastic scaling. Optimize inference with model pruning or quantization to reduce latency on commodity hardware.
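The online-learning sketch referenced in the Model Drift point above might look like this; the mini-batch generator and synthetic labels are stand-ins for a real stream of labeled feedback.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import f1_score

model = SGDClassifier(loss="log_loss")  # logistic regression trained by SGD
classes = np.array([0, 1])

def labeled_minibatches():
    # Stand-in for a stream of freshly labeled transactions.
    rng = np.random.default_rng(1)
    for _ in range(100):
        X = rng.normal(size=(256, 4))
        y = (X[:, 0] > 0.5).astype(int)
        yield X, y

for X_batch, y_batch in labeled_minibatches():
    # Score first, then learn: gives an honest running estimate of drift.
    if hasattr(model, "coef_"):
        print("F1 on incoming batch:", f1_score(y_batch, model.predict(X_batch)))
    model.partial_fit(X_batch, y_batch, classes=classes)
```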

The Future of AI in Fraud Detection

Whatever the future holds, it’s clear that AI’s role will only grow. Generative AI such as large language models (LLMs) could develop new methods of simulating fraud, while blockchain technology could guarantee that ledger transaction records are safe from modification. Identity verification through biometrics – face detection and voice recognition – will limit synthetic identity fraud.

As noted previously, the speed, accuracy, and adaptability of AI in real-time fraud detection let teams pinpoint and eliminate issues in digital payments that rule-based systems cannot. PayPal’s success is evidence of this capability, but the journey is not easy and requires fundamental discipline and a well-planned approach. For AI engineers, product managers, and fintech professionals, moving into this space is no longer just a career move; it is an opportunity to build a safer financial system for all.

What is LLMOps, MLOps for large language models, and their purpose

Why the transfer learning of large language models needs to be managed, and what that management includes: an introduction to LLMOps, the MLOps extension for LLMs.

How did LLMOps come to be? 

Large language models, embodied in generative neural networks (ChatGPT and its analogues), became the defining technology of the past year and are already actively used in practice by individuals and large companies alike. However, the process of training LLMs (Large Language Models) and putting them into industrial use must be managed like any other ML system. A good practice for this is the MLOps concept, aimed at eliminating the organizational and technological gaps between everyone involved in developing, deploying, and operating machine learning systems.

As GPT-style networks grow in popularity and are embedded in various applications, the principles and technologies of MLOps need to be adapted to the transfer learning used in generative models. Language models are becoming so large and complex that maintaining and managing them manually increases costs and reduces productivity. LLMOps – a branch of MLOps that oversees the LLM lifecycle from training to maintenance using dedicated tools and methodologies – helps avoid this.

LLMOps focuses on the operational capabilities and infrastructure required to fine-tune existing base models and deploy the improved models as part of a product. Because base language models are huge – GPT-3, for example, has 175 billion parameters – they require enormous amounts of data and compute time to train; it would take over 350 years to train GPT-3 on a single NVIDIA Tesla V100 GPU. Infrastructure that can run GPU machines in parallel and process huge datasets is therefore essential. LLM inference is also far more resource-intensive than traditional machine learning, since it often involves not a single model but a chain of models.

LLMOps provides developers with the necessary tools and best practices for managing the LLM development lifecycle. While the ideas behind LLMOps are largely the same as MLOps, large base language models require new methods, guidelines, and tools. For example, Apache Spark in Databricks works great for traditional machine learning, but it is not suitable for fine-tuning LLMs.

LLMOps focuses specifically on fine-tuning base models, since modern LLMs are rarely trained entirely from scratch. They are typically consumed as a service: a provider such as OpenAI or Google AI offers an API to an LLM hosted on its own infrastructure. There is also a custom LLM stack – a broad category of tools for fine-tuning and deploying custom solutions built on top of open-source GPT-style models. The fine-tuning process starts with an already trained base model, which is then trained on a smaller, more specific dataset to create a custom model. Once this custom model is deployed, queries are sent to it and the corresponding completions are returned. Monitoring and retraining the model are essential for consistent performance, especially in LLM-driven AI systems.
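As an illustration of that fine-tuning step, a hedged sketch using the open-source Hugging Face stack is shown below; the base model, dataset file, and hyperparameters are placeholders rather than recommendations.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments,
                          DataCollatorForLanguageModeling)

base = "gpt2"  # stand-in for whatever base model your stack or provider exposes
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# A small, domain-specific corpus (hypothetical file) used to specialize the base model.
raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = raw["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="custom-llm", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("custom-llm")  # the "custom model" that is then deployed
```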

Prompt engineering tools allow in-context learning to be performed faster and more cheaply than fine-tuning, and without exposing sensitive training data. In this approach, vector databases retrieve contextually relevant information for specific queries, and prompt templates and chaining optimize and improve model output.
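A bare-bones sketch of that retrieval-augmented prompting idea follows; the toy `embed` function and in-memory list stand in for a real embedding model and vector database, and the assembled prompt would then be sent to the hosted LLM.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embedding model: a character-frequency vector.
    vec = np.zeros(256)
    for ch in text.lower():
        vec[ord(ch) % 256] += 1.0
    return vec

# In a real system these snippets and vectors would live in a vector database.
documents = [
    "Refunds are processed within 5 business days.",
    "Premium users get 24/7 phone support.",
]
doc_vectors = [embed(d) for d in documents]

def build_prompt(question: str, top_k: int = 1) -> str:
    q = embed(question)
    scores = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
              for v in doc_vectors]
    ranked = [d for _, d in sorted(zip(scores, documents), reverse=True)[:top_k]]
    context = "\n".join(ranked)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# The assembled prompt would then go to the hosted LLM's completion API.
print(build_prompt("How fast are refunds?"))
```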

Similarities and differences with MLOps

In summary, LLMOps makes the practical application of LLMs possible by adding operational management, LLM chaining, monitoring, and observability techniques not typically found in conventional MLOps. Prompts in particular are the primary way humans interact with LLMs, and formulating a precise query is not a one-off act: it is usually an iterative process over several attempts before a satisfactory result is reached. LLMOps tools offer features to track and version prompts and their results, which makes it easier to evaluate the overall performance of a model, including operational work with multiple LLMs.

LLM chaining links multiple LLM invocations in sequence to deliver a single application function: the output of one invocation serves as the input to the next, producing the final result. This design approach breaks complex tasks into smaller steps and works around the inherent limit on the number of tokens an LLM can process at once. LLMOps simplifies chaining management and combines it with other document retrieval methods, such as vector database access.
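A minimal chaining sketch is shown below; the `llm` helper is a hypothetical placeholder for whatever completion API the application actually calls.

```python
def llm(prompt: str) -> str:
    # Placeholder: in a real system this would call the provider's API.
    return f"[summary of {len(prompt)} chars]"

def summarize_long_document(document: str, chunk_size: int = 3000) -> str:
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    # Step 1: summarize each chunk independently (stays under the token limit).
    partial = [llm(f"Summarize this section:\n{chunk}") for chunk in chunks]
    # Step 2: a second invocation combines the partial outputs into one answer.
    return llm("Combine these section summaries into one summary:\n" + "\n".join(partial))

print(summarize_long_document("..." * 5000))
```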

An LLMOps monitoring system collects real-time data points after a model is deployed in order to detect degradation in its performance. Continuous, real-time monitoring lets you identify, troubleshoot, and resolve performance issues before they affect end users. Specifically, prompts, tokens and their lengths, processing time, inference latency, and user metadata are tracked. This makes it possible to notice overfitting or a change in the underlying model before performance actually degrades.
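A simple sketch of that kind of per-request telemetry follows; the `call_llm` wrapper and the whitespace-based token count are stand-ins for a real model call and tokenizer.

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)

def call_llm(prompt: str) -> str:
    return "placeholder completion"  # stand-in for the real model call

def monitored_completion(prompt: str, user_id: str) -> str:
    start = time.perf_counter()
    completion = call_llm(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    # Emit one structured record per request for the monitoring pipeline.
    logging.info(json.dumps({
        "user_id": user_id,
        "prompt_tokens": len(prompt.split()),        # rough proxy for a tokenizer
        "completion_tokens": len(completion.split()),
        "latency_ms": round(latency_ms, 2),
    }))
    return completion

monitored_completion("Explain LLMOps in one sentence.", user_id="u-123")
```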

Monitoring models for drift and bias is also critical. Drift is a common problem in traditional machine learning models as well, but monitoring LLM solutions with LLMOps matters even more because of their reliance on underlying base models. Bias can arise from the original datasets the base model was trained on, from the custom datasets used for fine-tuning, or even from the human evaluators who judge prompt completions. A thorough evaluation and monitoring system is needed to remove bias effectively.

LLMs are difficult to evaluate with traditional machine learning metrics because there is often no single “right” answer, so LLMOps, more than traditional MLOps, relies on human feedback, incorporating it into testing, monitoring, and the data collected for future fine-tuning.

Finally, there are differences in how LLMOps and MLOps approach application design and development. LLMOps projects are designed to move fast and iteratively: they typically start with existing proprietary or open-source models and end with custom fine-tuned, or fully trained, models built on curated data.

Despite these differences, LLMOps is still a subset of MLOps. That’s why the authors of The Big Book of MLOps from Databricks have included the term in the second edition of this collection, which provides guiding principles, design considerations, and reference architectures for MLOps.

Data Fabric and Data Mesh: Complementary Forces or Competing Paradigms?

As the data landscape continues to evolve, two frameworks have emerged to help businesses manage their data ecosystems: Data Fabric and Data Mesh. Both aim to simplify data governance, integration, and access, but they differ considerably in philosophy and operation. Data Fabric focuses on technological orchestration across a distributed environment, while Data Mesh emphasizes organizational decentralization and domain-centric autonomy. This article examines the two frameworks – their definitions, strengths, and limitations – and the potential for synergy in a cloud-based architecture that integrates them.

What is Data Fabric?

The Data Fabric concept originated in 2015 and came into focus after Gartner included it among its top analytics trends for 2020. The DAMA DMBOK2 glossary defines data architecture as the plan for managing an organization’s data assets in line with a model of the organization’s data structures. Data Fabric implements this by offering a unified framework that automatically and logically integrates multiple disjointed data systems into one entity.

Simply put, Data Fabric is a single architectural layer that sits on top of multiple heterogeneous data ecosystems – on-premises systems, cloud infrastructures, edge servers – and abstracts away their individual complexities. It combines several data integration approaches, such as dedicated data access interfaces (APIs), reusable data pipelines, metadata-driven automation, and AI orchestration, to provide unrestricted access and processing. Unlike older data virtualization methods, which only constructed a logical view, Data Fabric also embraces containerization, enabling better management, control, and governance and making it more powerful for modernizing applications than traditional approaches.

Key Features of Data Fabric

  • Centralized Integration Layer: A virtualized access layer unifies data silos, governed by a central authority enforcing enterprise standards.
  • Hybrid Multi-Cloud Support: Consistent data management across diverse environments, ensuring visibility, security, and analytics readiness.
  • Low-Code/No-Code Enablement: Platforms like the Arenadata Enterprise Data Platform or Cloudera Data Platform simplify implementation with user-friendly tools and prebuilt services.

Practical Example: Fraud Detection with Data Fabric

Consider a financial institution building a fraud detection system (a minimal pipeline sketch follows the numbered steps):

  1. An ETL pipeline extracts customer claims data from multiple sources (e.g., CRM, transaction logs).
  2. Data is centralized in a governed repository (e.g., a data lake on Hadoop or AWS S3).
  3. An API layer, enriched with business rules (e.g., anomaly detection logic), connects tables and exposes the unified dataset to downstream applications.
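A hedged PySpark sketch of such a pipeline is shown below; the S3 paths, schemas, and anomaly rule are assumptions for illustration, not a reference design.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("fraud-fabric-etl").getOrCreate()

# Step 1: extract customer and transaction data from two illustrative sources.
crm = spark.read.option("header", True).csv("s3a://raw/crm/customers.csv")
txs = spark.read.json("s3a://raw/logs/transactions/")

# Step 2/3: join the sources and apply a simple business rule that downstream
# applications can consume through an API layer.
claims = (txs.join(crm, "customer_id")
             .withColumn("is_anomalous",
                         F.col("amount") > 3 * F.col("avg_monthly_spend")))

# Centralize the governed, query-ready dataset in the data lake.
claims.write.mode("overwrite").partitionBy("event_date").parquet("s3a://lake/claims/")
```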


While this approach excels at technical integration, it often sidesteps critical organizational aspects – such as data ownership, trust, and governance processes—leading to potential bottlenecks in scalability and adoption.

How Data Mesh Works

Data Mesh, introduced around 2019, is a data architecture framework that puts greater emphasis on people than on technology and processes. Like domain-driven design (DDD), Data Mesh advocates domain-oriented decentralization, distributing data ownership among business units. Unlike Data Fabric, which controls everything from a single point, Data Mesh makes domain teams responsible for treating data as a product that can be owned, accessed, and consumed in a self-service manner.

Core Principles of Data Mesh

  • Domain-Oriented Decentralization: The teams closest to the data – whether they generate it or consume it – own and manage it.
  • Data as a Product: Each dataset is more than a raw table; it is published as a product, complete with features such as access controls and metadata.
  • Self-Service Infrastructure: Decentralized domain teams can work autonomously thanks to a shared, centralized self-service platform.
  • Federated Governance: Domains retain local autonomy, while standards, data policies, and interfaces are coordinated centrally across the organization.

Practical Example: Fraud Detection with Data Mesh

Using the same fraud detection scenario:

  1. A domain team (e.g., the claims processing unit) defines and owns an ETL/ELT job to ingest claims data.
  2.  Datasets (e.g., claims, transactions, customer profiles) are stored separately, each with a designated owner.
  3.  A data product owner aggregates these datasets, writing logic to join them into a cohesive fraud detection model, delivered via an API or event stream.

This approach fosters accountability and trust by embedding governance into the process from the outset. However, its reliance on decentralized teams can strain organizations lacking mature data cultures or robust tooling.

Emerging Tools

Data Mesh is still maturing technologically. Google’s BigLake, launched in 2022, exemplifies an early attempt to support Data Mesh principles by enabling domain-specific data lakes with unified governance across structured and unstructured data.

Data Fabric works best with complex, siloed infrastructures, since it offers a top-down approach to data access. Data Mesh, on the other hand, performs well in decentralized organizations that are willing to undergo a cultural shift and prioritize trust and agility over technical standardization.

As with Data Fabric and Data Mesh themselves, an enterprise’s operational context and digital transformation journey determine how far each approach goes. The cloud provides a platform where the two can be integrated. Consider an architecture with an event bus (for example, Apache Kafka) streaming data to many different consumers: AWS S3 acting as a data lake, and ETL pipelines (Airflow for batch, NiFi for streaming) integrating operational and historical data. Add a robust Master Data Management (MDM) layer, and analytics quality follows.

This is where the synergy shines: Data Fabric’s centralized integration sets up the infrastructure, while Data Mesh’s domain autonomy makes innovation possible. The result is a cloud-native application platform that both enables and governs innovation. A Business Intelligence (BI) dashboard is one example: it could draw on clean data products from a Mesh-owned IoT domain, while the Fabric governs seamless access to the underlying data.

A Call to Innovate

Marrying these paradigms isn’t without hurdles. Architects and engineers must grapple with:

  • Migration Complexity: How do you transition on-premises data to the cloud without disruption?
  •  Real-Time vs. Batch: Can the platform balance speed and depth to meet business demands?
  •  Data Quality: How do you embed quality checks into a decentralized model?
  •  Security and Access: What federated security model ensures ease without compromising safety?
  •  Lifecycle Management: How do you govern data from creation to destruction in a hybrid setup?


Moreover, the cloud isn’t a silver bullet. Relational databases often fall short for advanced analytics compared to NoSQL, and data lake security models can hinder experimentation. Siloed data and duplication further complicate scalability, while shifting from centralized to decentralized governance requires a cultural leap.

The Verdict: Together, Not Versus

So, is it Data Fabric versus Data Mesh? The two are not really in conflict; they work hand in hand. Data Fabric weaves the technological threads for overarching access to information, while Data Mesh gives operational teams the authority to manage their own data. In a cloud-powered ecosystem, together they can transform data management by merging centralization’s productivity with decentralization’s creativity. The challenge is not which one to select, but how to combine their assets into a harmonious whole that nurtures trust, agility, and value for the enterprise. As the tools mature and institutions transform, these two concepts may well deliver the shift data architecture has long been waiting for – shaken, stirred, and beautifully blended.

UnlockED Hackathon

Shaping the Future of Education with Technology – February 25-26, 2024

ExpertStack proudly hosted the UnlockED Hackathon, a high-energy innovation marathon focused on transforming education through technology. With over 550 participants from the Netherlands and a distinguished panel of 15 judges from BigTech, the event brought together some of the brightest minds to tackle pressing challenges in EdTech.

The Challenge: Reimagining Education through Tech
Participants were challenged to develop groundbreaking solutions that leverage technology to make education more accessible, engaging, and effective. The hackathon explored critical areas such as:

  • AI-powered personalized learning – Enhancing student experiences with adaptive, data-driven education.
  • Gamification & immersive tech – Using AR/VR and interactive platforms to improve engagement.
  • Bridging the digital divide – Creating tools that ensure equal learning opportunities for all.
  • EdTech for skill-building – Solutions focused on upskilling and reskilling for the digital economy.

For 48 hours, teams brainstormed, designed, and built innovative prototypes, pushing the boundaries of what’s possible in education technology.

And the Winner is… Team OXY!
After an intense round of final presentations, Team OXY took home the top prize with their AI-driven adaptive learning platform that personalizes study plans based on real-time student performance. Their solution impressed judges with its scalability, real-world impact, and seamless integration with existing education systems.

Driving Change in EdTech
The UnlockED Hackathon was more than just a competition—it was a movement toward revolutionizing education through technology. By fostering collaboration between developers, educators, and industry leaders, ExpertStack is committed to shaping a future where learning is smarter, more inclusive, and driven by innovation.

Want to be part of our next hackathon? Stay connected and join us in shaping the future of tech! 🚀