apache / spark
Apache Spark - A unified analytics engine for large-scale data processing
See what the GitHub community is most excited about today.
Rocket Chip Generator
Open-source high-performance RISC-V processor
The Scala 3 compiler, also known as Dotty.
An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Shenzhen Metro big-data passenger flow analysis system 🚇🚄🌟
ZIO — A type-safe, composable library for async and concurrent programming in Scala
State of the Art Natural Language Processing
Build highly concurrent, distributed, and resilient message-driven applications on the JVM
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Chisel: A Modern Hardware Design Language
Source code for Twitter's Recommendation Algorithm
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Simple and Distributed Machine Learning
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
TheHive: a Scalable, Open Source and Free Security Incident Response Platform
A Spark plugin for reading and writing Excel files
Apache Spark Connector for SQL Server and Azure SQL