AutoCassandra: Automated Schema Generation Tool for Apache Cassandra Using Query-Driven Methodology

Harish Sharma¹, Govind Kapoor¹, Anita Singh¹
¹Vellore Institute of Technology, India
DOI: https://doi.org/10.71448/bcds2231-4
Published: 30/06/2022
Cite this article as: Harish Sharma, Govind Kapoor, Anita Singh. AutoCassandra: Automated Schema Generation Tool for Apache Cassandra Using Query-Driven Methodology. Bulletin of Computer and Data Sciences, Volume 3 Issue 1. Page: 34-53.

Abstract

Apache Cassandra’s wide-column data model offers high scalability and performance for big data applications, but it requires careful schema design that is driven largely by query patterns. Cassandra data modeling today remains a manual, expert-dependent process that is prone to errors and inefficiencies. This paper presents AutoCassandra, a novel tool that automates the generation of optimized Cassandra schemas from conceptual models and query workflows. Building on established mapping rules and patterns, AutoCassandra translates UML conceptual models and application queries directly into production-ready CQL schemas. Our evaluation demonstrates that AutoCassandra reduces schema design time by 68% while producing schemas that outperform manually designed equivalents by 23% in query execution time across diverse use cases. The tool represents a significant step toward making Cassandra’s performance benefits accessible to non-expert developers while ensuring best practices in schema design.

Keywords: NoSQL, Cassandra, automated schema generation, query-driven design, data modeling tool, big data, database optimization
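
To make the query-driven idea in the abstract concrete, the following is a minimal sketch of how one access pattern might be mapped to a CQL table. It is not AutoCassandra's implementation: the `Query` class, the `to_cql` function, and the `videos_by_user` example are hypothetical names introduced for illustration, and the mapping assumed here (equality-predicate attributes become the partition key, ordering attributes become clustering columns) is only the commonly used query-first convention.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical description of one access pattern (one table per query)."""
    table: str
    equality: list                                  # attributes in equality predicates
    ordering: list = field(default_factory=list)    # (attribute, "ASC" | "DESC") pairs
    selected: list = field(default_factory=list)    # other attributes returned
    types: dict = field(default_factory=dict)       # attribute -> CQL type


def to_cql(q: Query) -> str:
    """Build a CREATE TABLE statement under the assumed mapping convention."""
    # Key columns first: partition key attributes, then clustering attributes.
    key_cols = list(q.equality) + [name for name, _ in q.ordering]
    other_cols = [c for c in q.selected if c not in key_cols]
    col_defs = ",\n".join(f"    {c} {q.types[c]}" for c in key_cols + other_cols)

    partition = ", ".join(q.equality)
    clustering = "".join(f", {name}" for name, _ in q.ordering)
    cql = (
        f"CREATE TABLE {q.table} (\n{col_defs},\n"
        f"    PRIMARY KEY (({partition}){clustering})\n)"
    )
    if q.ordering:
        order = ", ".join(f"{name} {direction}" for name, direction in q.ordering)
        cql += f" WITH CLUSTERING ORDER BY ({order})"
    return cql + ";"


# Example: "find the videos of a given user, newest first" (illustrative only).
example = Query(
    table="videos_by_user",
    equality=["user_id"],
    ordering=[("added_at", "DESC")],
    selected=["title"],
    types={"user_id": "uuid", "added_at": "timestamp", "title": "text"},
)
print(to_cql(example))
```

The sketch handles a single query in isolation; a real tool would also need to consult the conceptual model to guarantee key uniqueness and to merge tables that serve several queries.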



Copyright © 2022 Harish Sharma, Govind Kapoor, Anita Singh. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
