Their team name might be Memory Leak, but the five Orange Coast College students who took home a title from the latest ASA DataFest at Chapman University won’t be forgotten anytime soon.
Together, they’re the first community college team to secure a win in any category of the competition since such institutions started participating last year.
ASA DataFest is a two-day “hackathon” that challenges teams of undergraduate students to clean, explore and analyze a large, corporate-provided data set. All work must be done onsite, though participants are free to come and go.
For last month’s competition, Memory Leak used Python, a popular general-purpose programming language widely used for data analysis and web applications, and created a two-slide presentation to show a panel of judges. The team won for Best Use of External Data, one of three award categories.
All the team members from the Costa Mesa campus — Naomi Valentin, Ryan Millett, Phuoc Do, Jacob Leenerts and Hector Elias — are computer science students. OCC computer science professor Nadia Ahmed served as their advisor.
The data provided to teams changes every year. This year’s set came from the Canadian women’s rugby team and included information on players’ wellness, their geographic locations, competitions and force load or impacts.
Elias said they gathered additional material from the National Center for Biotechnology Information, academic journals and research on traumatic brain injuries from the National Football League to compare against the corporate-provided data.
In doing so, the OCC team was able to “devise a means of counting up the number of collisions sustained by each player, which was corroborated with actual gameplay footage,” Ahmed said.
Ahmed said the data given to participants “is not clean … there’s a lot of missing values. There’s mismatches in the data. There might be inconsistent sensor readings that don’t make sense. So they have to completely sift through it, just like a data scientist would do from scratch.”
“What was interesting for us was we kept going back and forth and everything that kept sticking out was the sleep patterns,” Elias said. “I know that, having had brothers that were in the military, brain traumatic injuries are a severe issue. Everything we were looking at was subjective, but that was the only thing that kept coming up in everything we discussed.”
ASA DataFest began in 2011 at UCLA. Its inaugural year drew about 30 competitors who studied five years of arrest records provided by Lt. Thomas Zak of the Los Angeles Police Department.
Now, the American Statistical Assn. sponsors events annually at universities across the country between late March and early May. The competition at Chapman in Orange was one of two in Southern California.
Community colleges did not compete until last year because “the logistics were just not prepared for such,” according to Ahmed.
“Based on the sign-up form for the competition, there was no community college category for the students to register under,” she said. “I had some back and forth with competition coordinators … [and] in the end, they were welcomed.”
Ahmed took a half-dozen teams to the competition last year: five from OCC, including Memory Leak, and one from Saddleback College in Mission Viejo.
Two other teams from OCC, Team Zero and Anonymous Dinosaurs, also competed this year under Ahmed’s mentorship. In all, the competition included 28 teams from schools across California, including Cal Poly Pomona, Chapman and UC Irvine.
Ahmed said ASA DataFest “was intense because the event is geared toward statisticians, so here we are, a bunch of computer scientists” — “computer science nerds,” Leenerts jokingly interjected — “that are going in and they’re interested in the data science because of machine learning.”
Though Do and Valentin are transferring at the end of this school year, Ahmed remains optimistic about the team’s future. She said its victory inspired the creation of a new machine learning course that launches June 10, the first time that class will be offered at the community college level.
That, combined with courses in Python and Ahmed’s continued work as advisor, will serve as training for those interested in competing next year.