OpenMALx

Repository for Infosec and Machine Learning Resources

OpenMALx is an organization focused on developing datasets and models for security analysis. The project aims to provide structured data for training and evaluating large language models in a security context.

Technical Focus

Dataset Formatting: Processing raw security tool logs into instruction/response pairs for model training.
Local Execution: Optimizing models for local hardware to ensure data remains on-premises.
Response Logic: Developing structured formats for explaining security vulnerabilities and remediation steps.
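
As an illustration of the Dataset Formatting item above, the following is a minimal sketch of how a raw tool log could be turned into an instruction/response pair and serialized as JSON Lines. The field names ("instruction", "response"), the helper names, the prompt wording, and the sample log are all assumptions made up for this example; they are not taken from the project's actual schema or data.

```python
import json


def to_training_pair(tool_name, raw_output, summary):
    """Build one instruction/response record from a tool log and its summary.

    The "instruction"/"response" keys are hypothetical, not the project's schema.
    """
    instruction = (
        f"Summarize the following {tool_name} output and note any security findings.\n\n"
        f"{raw_output.strip()}"
    )
    return {"instruction": instruction, "response": summary.strip()}


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


# Made-up sample input, for illustration only.
example = to_training_pair(
    "nmap",
    "22/tcp open ssh\n80/tcp open http\n",
    "Two open ports: SSH (22) and HTTP (80).",
)
print(to_jsonl([example]))
```

JSON Lines is used here only because many training pipelines accept one record per line; any other container format would work the same way.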

Active Projects

infosec-tool-output: A dataset mapping static and dynamic analysis tool outputs to technical summaries.
open-malsec: A collection of text-based security threats, including phishing and social engineering samples, for classification tasks.

Participation

This is an open-source project. Contributions are welcome in the following areas:

Data Contribution: Providing anonymized security logs or threat samples.
Model Training: Testing and benchmarking model performance on security tasks.
Content Review: Verifying the technical accuracy of dataset summaries.
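
For the Data Contribution item above, a rough sketch of one possible anonymization pass before submitting logs is shown below. It only masks IPv4 addresses and email addresses with simple regular expressions; the patterns, the placeholder tokens, and the function name redact are assumptions for illustration, not a project requirement. Real logs can contain other identifiers (hostnames, usernames, tokens) that this sketch does not handle.

```python
import re

# Simplified patterns (assumptions): they do not validate octet ranges
# or cover every legal email address form.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact(text):
    """Replace email and IPv4 addresses with placeholder tokens."""
    text = EMAIL.sub("<EMAIL>", text)
    return IPV4.sub("<IP>", text)


# Made-up log line using documentation-reserved values.
print(redact("Failed login for bob@example.com from 192.0.2.15"))
```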