Cassandra Summit

This event has passed.

Workshops

We’d like to invite all registered attendees of Cassandra Summit + AI.dev to join us at the following workshops.

Unless otherwise noted, pre-registration is not required. We will do our best to accommodate all interested attendees, but please note that participation is on a first-come, first-served basis.

Training Workshop: Bring along your use case

Monday, December 11

Time: 1:00 – 5:00pm PST
Location: San Jose Marriott, Salons I/II, located on the 2nd Floor

Registration: Free, but pre-registration is required; to register, add the Training Workshop to your Cassandra Summit + AI.dev registration. Registering for the Training Workshop automatically registers you for the Networking Reception.

If you are not able to attend the Training Workshop and would like to attend the Networking Reception separately, please RSVP to community@axonops.com.

Speakers:
Jon Haddad – Rustyrazorblade Consulting
Jordan West – Netflix
Hayato Shimizu – AxonOps
Johnny Miller – AxonOps
Patrick McFadin – DataStax

Sponsored by AxonOps, with support from some of the world’s leading Cassandra experts, who have worked on many of the world’s largest and most complex Cassandra environments. This 4-hour training workshop will be slide-free and highly interactive!

You will be asked to bring along your use case and your primary areas of interest, and you will walk away with the answers and information you need.

The training will be split into 2 phases.

  1. Topics shortlist: Starting as a group session, we’ll work through your top topics with contributions from our expert panel to tease out valuable information from the get-go. This phase will take approximately 1 hour.
  2. Deep dive breakouts: We’ll split into 4 groups across key Cassandra topics, each armed with specific areas to focus on, and you’ll rotate topics every 45 minutes. A Cassandra expert will host each group and, with a whiteboard in hand, dive deep into the primary areas of interest.

This will be a unique training experience where you’ll get a chance to engage with your peers and interact with some of the leading Cassandra experts.

Log Processing with Apache NiFi and Apache Kafka

Tuesday, December 12

Time: 11:00am – 12:20pm PST
Location: TBA

Speaker: Sophia Izokun, Data Scientist – IBM

This session will be a technical workshop demonstrating how to leverage open source tools (NiFi and Kafka) in real-time log processing. Log analytics is the process of ingesting, indexing, and assessing information from one or more events captured from a computing system; most, if not all, computers and programmable software produce logs. The log dataset used for this workshop is the NASA Kennedy Space Center logs (from a web server located in Florida).

The workshop will cover the following:

  • How to set up the environment
  • Data transformation
  • Creating a Kafka Topic and NiFi Flow
  • Setting up dataflow to get data from NiFi to Kafka Topic
  • Loading data from Kafka topic into OpenSearch, a well-suited open source tool for log analytics (see the sketch after this list)
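
For a concrete picture of that last step, here is a minimal sketch of consuming records from a Kafka topic and indexing them into OpenSearch. It is illustrative only and not the workshop’s material: it assumes a local Kafka broker and OpenSearch instance, a hypothetical topic name (nasa-logs), and the kafka-python and opensearch-py client libraries.

```python
# Minimal sketch (not the workshop's code): consume log lines from a Kafka
# topic and index them into OpenSearch. Assumes kafka-python and opensearch-py
# are installed, Kafka is on localhost:9092, and OpenSearch on localhost:9200.
from kafka import KafkaConsumer
from opensearchpy import OpenSearch

TOPIC = "nasa-logs"   # hypothetical topic name for the NASA KSC logs
INDEX = "nasa-logs"   # target OpenSearch index

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
if not client.indices.exists(index=INDEX):
    client.indices.create(index=INDEX)

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: v.decode("utf-8"),
)

for message in consumer:
    # Each Kafka record is one log line routed from NiFi; index it as a document.
    client.index(index=INDEX, body={"log_line": message.value})
```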

Empowering Collaboration: AI Developer Experiences – Your Bridge from Model to Production

Tuesday, December 12

Time: 3:00pm – 4:10pm PST
Location: TBA

Speaker: Cedric Clyburn, Developer Advocate – Red Hat

In today’s rapidly evolving tech landscape, collaboration between data scientists and developers is more critical than ever. An open-source strategy for AI empowers organizations to harness the full potential of their data and technology stack. In this session, we show a complete end-to-end story for Generative AI, from model development to application deployment, using open source tools such as Jupyter, Ray, KServe, Kubernetes, and Backstage, enabling seamless collaboration between data scientists and developers.

Join this session to discover how these tools bridge the gap from model to production, fostering a culture of MLOps and GenOps in your projects.

Define “Open AI”

Tuesday, December 12

Time: 4:20pm – 6:15pm PST
Location: TBA

Speakers:
Ruth Suehle, EVP, Apache Software Foundation & Director of Open Source – SAS
Stefano Maffulli – Open Source Initiative
Mer Joyce – Do Big Good

As the legislators accelerate and the doomsayers chant, one thing is clear: It’s time to define what “open” means in the context of AI/ML before it’s defined for us.

Join this interactive session to share your thoughts on what it means for Artificial Intelligence and Machine Learning systems to be “open”. The Open Source Initiative wants to hear from attendees what they think should be the shared set of principles that can recreate the permissionless, pragmatic, and simplified collaboration for AI practitioners, similar to what the Open Source Definition has done for software.

We’ll share a draft of a new definition of “open” AI/ML systems and ask attendees to review it in real time.

Prompt Engineering with Watsonx

Wednesday, December 13

Time: 10:30am – 11:40am PST
Location: TBA

Speaker: Rafael Vasquez, Open Source Software Developer – IBM

Part art, part science, prompt engineering is the process of crafting input text to fine-tune a given large language model for best effect.

Foundation models have billions of parameters and are trained on terabytes of data to perform a variety of tasks, including text, code, or image generation, classification, conversation, and more. A subset, known as large language models, is used for text- and code-related tasks. When it comes to prompting these models, there isn’t just one right answer. There are multiple ways to prompt them for a successful result.

In this workshop, you will learn the basics of prompt engineering, from monitoring your token usage to balancing intelligence and security. You will be guided through a range of exercises where you will be able to utilize the different techniques, dials, and levers illustrated in order to get the output you desire from the model. Participants of this workshop will be equipped with a comprehensive understanding of prompt engineering along with the practical skills required to achieve the best results with large language models.
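
To make the idea concrete, here is a small, purely illustrative sketch of a few-shot prompt and the kind of decoding parameters (the “dials and levers”) most LLM APIs expose. The prompt template, parameter values, and the generate() helper are hypothetical and are not the watsonx interface used in the workshop.

```python
# Illustrative sketch only: the prompt template, examples, and the generate()
# helper below are hypothetical, not the watsonx API used in the workshop.
FEW_SHOT_PROMPT = """Classify the sentiment of each review as Positive or Negative.

Review: The battery lasts all day and the screen is gorgeous.
Sentiment: Positive

Review: It stopped working after a week and support never replied.
Sentiment: Negative

Review: {review}
Sentiment:"""

# Typical decoding "dials and levers" exposed by most LLM APIs.
params = {
    "max_new_tokens": 5,       # keep the answer short to control token usage
    "temperature": 0.0,        # deterministic output for a classification task
    "stop_sequences": ["\n"],  # stop after the single-word label
}

def generate(prompt: str, **params) -> str:
    """Hypothetical stand-in for a foundation-model text-generation call."""
    raise NotImplementedError("Replace with your model provider's client call.")

prompt = FEW_SHOT_PROMPT.format(review="Setup took five minutes and it just works.")
# print(generate(prompt, **params))
```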

GenOps: Building a MLOps Platform to Support GenAI Workloads with Open-Source and Kubeflow

Wednesday, December 13

Time: 11:10am – 12:20pm PST
Location: TBA

Speaker: Farshad Ghodsian, Lead Consultant, Data & AI – Sourced Group

This session takes a deep dive into how we have built an end-to-end MLOps platform on GKE (Google Kubernetes Engine) using open source technologies like Kubeflow, MLflow, Spark on Kubernetes, and other open source tools, and how we are using it to support Generative AI models (specifically LLMs) in the cloud. We will also walk through some learnings, tips, and a demo of how you can leverage the same open source tooling to run your models.
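
For a flavor of the tooling involved, here is a minimal, hypothetical Kubeflow Pipelines (kfp v2) sketch; the component and pipeline names are invented for illustration and are not the presenters’ platform code.

```python
# Hypothetical sketch of a tiny Kubeflow Pipelines (kfp v2) workflow, just to
# illustrate the style of tooling discussed; not the presenters' platform code.
from kfp import dsl, compiler

@dsl.component(base_image="python:3.11")
def prepare_prompt(raw_text: str) -> str:
    # Trivial "preprocessing" step standing in for real data preparation.
    return raw_text.strip().lower()

@dsl.component(base_image="python:3.11")
def fake_generate(prompt: str) -> str:
    # Stand-in for a call to an LLM serving endpoint.
    return f"echo: {prompt}"

@dsl.pipeline(name="genai-demo-pipeline")
def genai_pipeline(raw_text: str = "Hello, GenOps!"):
    prepared = prepare_prompt(raw_text=raw_text)
    fake_generate(prompt=prepared.output)

if __name__ == "__main__":
    # Compile to a pipeline spec that can be submitted to a Kubeflow cluster.
    compiler.Compiler().compile(genai_pipeline, "genai_pipeline.yaml")
```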

Get Hands On with Amazon Keyspaces (for Apache Cassandra): A workshop for serverless developers

Wednesday, December 13

Time: 11:50am – 1:30pm PST
Location: San Jose McEnery Convention Center

Registration: Free, but pre-registration is required. To register for the workshop, add it to your Cassandra Summit + AI.dev registration.

Sponsored by AWS

In this 90-minute builders’ session, acquire hands-on experience using Amazon Keyspaces, a serverless, fully managed, Apache Cassandra-compatible database on AWS.

Start with an introduction to Amazon Keyspaces and see how to deploy a production-grade multi-region table in seconds. Next, explore the advantages of Cassandra’s wide-column model and its CQL API for delivering single-digit millisecond performance at scale. You will then create modern architectures that do more with less by leveraging native integrations with AWS services, and learn best practices from the principal solutions architects, product leaders, and senior service engineers who build the Amazon Keyspaces service to maximize performance and scalability in the cloud for your applications. Finally, learn how to use Amazon Keyspaces to scale AI/ML feature stores on AWS.
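
As a taste of the CQL involved, the sketch below shows a multi-region keyspace and a simple feature-store-style table created through the DataStax Python driver. It is illustrative only: the keyspace, table, and region list are hypothetical examples, and the connection setup to Amazon Keyspaces (TLS plus SigV4 or service-specific credentials) is assumed to have been done elsewhere.

```python
# Illustrative sketch, not AWS workshop material. Assumes an already-authenticated
# DataStax Python driver session connected to Amazon Keyspaces; the keyspace,
# table, and regions below are hypothetical examples.
from cassandra.cluster import Session

def create_schema(session: Session) -> None:
    # Amazon Keyspaces documents 'MultiRegionStrategy' with a region list for
    # multi-Region keyspaces (replicated across the listed AWS Regions).
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo_ks
        WITH REPLICATION = {'class': 'MultiRegionStrategy',
                            'regions': ['us-east-1', 'us-west-2']}
    """)
    # A simple wide-column table, e.g. for an online feature store.
    session.execute("""
        CREATE TABLE IF NOT EXISTS demo_ks.feature_store (
            entity_id text,
            feature_name text,
            feature_value double,
            updated_at timestamp,
            PRIMARY KEY ((entity_id), feature_name)
        )
    """)
```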

This presentation is brought to you by Amazon Keyspaces, and the AWS Open Source team. Lunch will be provided.

Agenda

  • 11:50 – Intro – Open source and Serverless
  • 12:00 – Production grade Amazon Keyspaces multi-region table in seconds
  • 12:15 – Adaptive Capacity – a purposeful scaling mechanism
  • 12:30 – Setting up serverless data pipelines using Amazon Keyspaces
  • 12:40 – Building a modern online feature store using Amazon Keyspaces
  • 1:10 – Conclusion Q&A
  • 1:20 – End

LLMs Fine Tuning and Inferencing Using ONNX Runtime

Wednesday, December 13

Time: 4:10pm – 5:20pm PST
Location: TBA

Speakers:
Kshama Pawar, Principal Program Manager – Microsoft
Sunghoon Choi, Principal Software Eng Manager – Microsoft
Abhishek Jindal, Software Engineer, AI Frameworks – Microsoft

This workshop demonstrates an example of fine-tuning and inferencing a Large Language Model (LLM) using ONNX Runtime. ONNX Runtime is a cross-platform inference and training machine-learning accelerator; it enables easier and faster customer experiences with lower costs. We will walk through an end-to-end example of fine-tuning and inferencing on the latest LLMs such as Llama, Mistral, and Zephyr, with a real-world application for the model, and show how users can leverage existing technologies for a quick and simple setup to get started.

The workshop will be run on AzureML using a user-friendly Docker environment called Azure Container for PyTorch (ACPT), which already includes the latest technologies validated for model training, such as DeepSpeed, a well-adopted distributed-training library for training large deep learning models, and LoRA (Low-Rank Adaptation), a method to fine-tune LLMs on new data without training the entire model. We will also take a deep dive into what ONNX Runtime is, why it can execute code faster than the baseline, and how portable it is across frameworks and devices.
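
For orientation, here is a minimal, hypothetical sketch of the general pattern: attach LoRA adapters with PEFT, then route the training step through ONNX Runtime’s ORTModule. It is not the workshop’s code; it assumes torch, transformers, peft, and the onnxruntime-training package are installed, and it uses a tiny stand-in model rather than Llama, Mistral, or Zephyr.

```python
# Hypothetical sketch of the pattern discussed (not the workshop's code):
# LoRA adapters via PEFT + training routed through ONNX Runtime's ORTModule.
# Requires torch, transformers, peft, and the onnxruntime-training package.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from onnxruntime.training.ortmodule import ORTModule

model_name = "sshleifer/tiny-gpt2"  # tiny stand-in to keep the sketch runnable
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach LoRA adapters: only the low-rank adapter weights are trained.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

# Route forward/backward passes through ONNX Runtime's training backend.
model = ORTModule(model)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)

batch = tokenizer(["ONNX Runtime accelerates LLM fine-tuning."], return_tensors="pt")
labels = batch["input_ids"].clone()

model.train()
outputs = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                labels=labels)
outputs.loss.backward()  # one illustrative training step
optimizer.step()
```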


Apache Cassandra®, Cassandra and Apache® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by The Linux Foundation.