From CVE to Template: The Future of Automating Nuclei Templates with AI

Why We Did This
How We Did It
Fetching the Latest CVEs Using CVEmap
Filtering Relevant CVEs with the CVEmap API
Extracting POC Details Using AI
PDCP API to Generate Template
Reviewing and Refining the Template
Challenges We Faced
Future Improvements
Using ProjectDiscovery for Template Creation
Conclusion

Authors

Prince Chaddha

In the fast-paced world of cybersecurity, staying ahead of new vulnerabilities is a constant challenge. Every week, new vulnerabilities are disclosed, and attackers are quick to exploit them. As security teams, we need to be equally swift in identifying and addressing these vulnerabilities to protect our organizations and users.

At ProjectDiscovery, we wanted to speed up the creation of Nuclei templates for new CVEs. Writing a template by hand for every new CVE is time-consuming and can delay detection. To address this, we explored how AI could automate the generation of Nuclei templates from newly disclosed CVEs.

In this blog post, I’ll walk you through how we set up an internal system to generate Nuclei templates using AI. We’ll cover our motivations, the steps we took, the challenges we faced, and the solutions we developed. Our goal is to share our experience openly, encourage collaboration, and inspire others in the cybersecurity community to explore similar approaches.

Before diving into the details, let me highlight two exciting initiatives for our community: the Templates Leaderboard, showcasing top contributors, and the Nuclei Templates Community Rewards Program, which recognizes and rewards impactful contributions. Learn more about these programs here.

Why We Did This

With the support of our community, the nuclei-templates project is healthy and growing. Even so, keeping up with every new CVE is tough, and we wanted to move faster. Automating the generation of Nuclei templates helps ensure we don't miss critical vulnerabilities.

We decided to harness AI’s power to automate the generation of Nuclei templates for the following reasons:

• Speed: Reduce the time between a CVE’s disclosure and the availability of a corresponding detection template.

• Coverage: Ensure that a broader, more comprehensive set of vulnerabilities is addressed promptly.

• Efficiency: Free up our team to focus on reviewing and refining templates rather than creating them from scratch.

• Collaboration: Share our approach with the community to foster innovation and collective improvement.

How We Did It

Let me break down the steps we took to set up our AI-driven template generation system.

Fetching the Latest CVEs Using CVEmap

First, we needed to stay updated with the newest CVEs. We used CVEmap, our command-line tool that provides structured access to CVE databases.

To use CVEmap, you need an auth token, which you can obtain by signing up on ProjectDiscovery. After signing up, navigate to your profile to find the token. Once you have it, run cvemap -auth <your-token> to configure the tool.

We can also use flags to filter the results further:

cli

cvemap -kev -template=false -poc -re

The flags above show only CVEs that are in CISA's Known Exploited Vulnerabilities (KEV) catalog, have a public proof of concept, are remotely exploitable, and do not yet have a public Nuclei template.

Similarly, we can query a single CVE and fetch its JSON data directly.

Filtering Relevant CVEs with the CVEmap API

We wanted to automate fetching CVEs that are relevant and don’t already have a Nuclei template. To do this, we used the CVEmap API.

API Request:

cli

curl -s "https://cve.projectdiscovery.io/api/v1/cves" \
  -H "X-Pdcp-Key: <PDCP-AUTH-TOKEN>" \
  --get \
  --data-urlencode "published_at_gt=2024" \
  --data-urlencode "is_poc=true" \
  --data-urlencode "is_template=false" \
  --data-urlencode "cvss_metrics.cvss31.vector_like=CVSS:3.1/AV:N/AC:L/PR:N/*"

This request fetches CVEs:

• Published after January 1, 2024.
• With a public POC.
• Without an existing Nuclei template.
• That are exploitable over the network without authentication.
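
For readers who prefer to script this step, here is a minimal Python sketch of the same request using only the standard library. The endpoint, header name, and query parameters are copied from the curl command above. The function names are illustrative, and the response is returned as parsed JSON without assuming anything about its schema.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
CVEMAP_API = "https://cve.projectdiscovery.io/api/v1/cves"


def build_cve_query_url(published_after="2024"):
    """Return the CVEmap API URL with the same filters as the curl example."""
    params = {
        "published_at_gt": published_after,
        "is_poc": "true",
        "is_template": "false",
        "cvss_metrics.cvss31.vector_like": "CVSS:3.1/AV:N/AC:L/PR:N/*",
    }
    # urlencode percent-encodes the values, like curl's --data-urlencode.
    return CVEMAP_API + "?" + urllib.parse.urlencode(params)


def fetch_cves(token, published_after="2024"):
    """Send the request and return the decoded JSON body (schema not assumed)."""
    request = urllib.request.Request(
        build_cve_query_url(published_after),
        headers={"X-Pdcp-Key": token},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping URL construction separate from the network call makes the filters easy to inspect or unit test without a token.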

Extracting POC Details Using AI

Next, we needed to extract the technical details from the POC URLs to create a Nuclei template.

We automated the process of opening the POC URL in a headless browser and used ChatGPT to summarize the technical content. We used the following prompt:

cli

Extract and preserve all technical details, including script codes, steps, POCs, endpoints, raw HTTP requests, and responses related to the identified vulnerabilities. Ensure the inclusion of specific details such as parameters, URLs, payloads, and response behaviors critical for understanding and replicating the issues. Avoid general commentary and non-technical content.\n\nTitle: {page_title}\nContext: {combined_content}
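
The two placeholders in the prompt, {page_title} and {combined_content}, can be filled with ordinary string formatting. The Python sketch below is one way to do that. It assumes the text of each crawled POC page is already available as a string. The function name and the max_chars cap are hypothetical and not part of the original pipeline.

```python
PROMPT_TEMPLATE = (
    "Extract and preserve all technical details, including script codes, steps, "
    "POCs, endpoints, raw HTTP requests, and responses related to the identified "
    "vulnerabilities. Ensure the inclusion of specific details such as parameters, "
    "URLs, payloads, and response behaviors critical for understanding and "
    "replicating the issues. Avoid general commentary and non-technical content."
    "\n\nTitle: {page_title}\nContext: {combined_content}"
)


def build_summary_prompt(page_title, page_texts, max_chars=12000):
    """Join the non-empty page texts and fill the prompt placeholders.

    max_chars is a hypothetical cap that keeps the prompt within a model's
    context window; tune or remove it as needed.
    """
    combined = "\n\n".join(text.strip() for text in page_texts if text.strip())
    return PROMPT_TEMPLATE.format(
        page_title=page_title.strip(),
        combined_content=combined[:max_chars],
    )
```

Because str.format only interprets braces in the template itself, page content that contains braces (JSON payloads, for example) is inserted unchanged.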

We faced challenges, such as difficulty in reading and summarizing content from random WordPress blogs, Medium posts, and similar references. We are still exploring better methods to crawl and extract such content effectively.
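
To illustrate the extraction problem, the sketch below reduces an HTML page to its visible text with Python's standard html.parser module. It is a deliberately simple stand-in, not the headless-browser approach described above: it does not execute JavaScript, and it trims whitespace, so indentation inside code samples is lost. The class and function names are illustrative.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text that sits outside <script>, <style>, and <noscript>."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML document, one chunk per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Pages that render their content client-side will come back nearly empty with this approach, which is one reason a headless browser is the more robust choice.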

Example AI Response:

bash

ID: CVE-2024-0235

Description: The EventON WordPress plugin before 4.5.5, EventON WordPress plugin before 2.2.7 do not have authorisation in an AJAX action, allowing unauthenticated users to retrieve email addresses of any users on the blog
Severity: medium

--------------------
PoC URLs:
URL: https://wpscan.com/vulnerability/e370b99a-f485-42bd-96a3-60432a15a4e9/
Content: Endpoint: https://example.com/wp-admin/admin-ajax.php?action=eventon_get_virtual_users

Payload for administrator user emails:
```
_user_role=administrator
```

Payload for subscriber user emails:
```
_user_role=subscriber
```

Raw HTTP Request:
```
POST /wp-admin/admin-ajax.php?action=eventon_get_virtual_users HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 20

_user_role=administrator
```

Response Behavior:
- The response will contain the email addresses of the users with the specified role (administrator or subscriber) in the blog.
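
A summary like this is easier to reuse once the raw HTTP request is split into its parts. The following Python sketch is a simplified parser for illustration, not a full HTTP implementation. It also compares the declared Content-Length with the actual body size, a small sanity check on AI-produced request data. In the example above the header says 20, while the body _user_role=administrator is 24 bytes long. The function names are hypothetical.

```python
def parse_raw_request(raw):
    """Split a raw HTTP/1.x request into method, target, headers, and body."""
    text = raw.replace("\r\n", "\n").strip()
    head, _, body = text.partition("\n\n")  # blank line separates headers and body
    request_line, *header_lines = head.split("\n")
    method, target, _version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "headers": headers, "body": body}


def content_length_mismatch(parsed):
    """Return (declared, actual) when Content-Length disagrees with the body, else None."""
    declared = parsed["headers"].get("content-length")
    actual = len(parsed["body"].encode("utf-8"))
    if declared is not None and int(declared) != actual:
        return int(declared), actual
    return None


RAW_REQUEST = """POST /wp-admin/admin-ajax.php?action=eventon_get_virtual_users HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 20

_user_role=administrator"""

parsed = parse_raw_request(RAW_REQUEST)
print(parsed["method"], parsed["target"])
print(content_length_mismatch(parsed))  # prints (20, 24)
```

Nuclei computes the request length itself when the body is given in a template, so the mismatch is harmless here, but it shows why extracted details deserve a second look.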

PDCP API to Generate Template

After gathering all the necessary POC details, our next step was to generate the Nuclei template.

During this phase, we encountered several challenges related to the quality and functionality of the generated templates. These issues included YAML linting errors, incorrect mapping values, templates with random fields, improper structures that didn't align with the Nuclei engine, and weak matchers that caused inconsistencies in vulnerability detection.

To overcome these hurdles, we continuously refined our AI prompts and expanded our knowledge base with more example templates. This iterative process significantly reduced the number of non-functional issues, enhancing the reliability of the generated templates.

To further streamline and improve the template generation process, we developed an internal API called TemplateMan. TemplateMan is dedicated to handling non-functional enhancements, ensuring that each template meets our quality standards before being added to the repository. Here are some of the tasks that TemplateMan handles:

• Updating metadata fields, including the Shodan and FOFA query fields.
• Updating CVSS classification fields such as cvss-metrics, cvss-score, and epss-score.
• Adding product and vendor details to the template.
• Maintaining uniformity in template structure and formatting to align with our nuclei-templates repository standards.

By automating these non-functional tasks, TemplateMan has significantly improved the quality and consistency of our templates, reducing the need for extensive manual corrections. All these features can be accessed through our PDCP API.

This is still a work in progress, and we are continuously refining and enhancing it through iterative improvements.

bash

curl --request POST \
  --url https://api.projectdiscovery.io/v1/template/ai \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <PDCP-API-KEY>' \
  --data '{
    "prompt": "<EXTRACTED-POC-FROM-ABOVE-REQUEST>"
  }'
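
The same call can be made from a script. The Python sketch below mirrors the curl command using only the standard library. The URL, headers, and the "prompt" field come from that command. The function names are illustrative, and since the response format is not shown here, the body is simply returned as parsed JSON.

```python
import json
import urllib.request

# URL taken from the curl example above.
TEMPLATE_AI_URL = "https://api.projectdiscovery.io/v1/template/ai"


def build_template_request(poc_summary, api_key):
    """Build the POST request carrying the extracted POC details as the prompt."""
    payload = json.dumps({"prompt": poc_summary}).encode("utf-8")
    return urllib.request.Request(
        TEMPLATE_AI_URL,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


def generate_template(poc_summary, api_key, timeout=60):
    """Send the request and return the decoded JSON response (schema not assumed)."""
    request = build_template_request(poc_summary, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Using json.dumps for the body avoids the quoting problems that appear when a multi-line POC summary is pasted into a shell string.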

Generated Template:

yaml

id: CVE-2024-0235

info:
  name: EventON WordPress Plugin Unauthorized Email Access
  author: pd-bot
  severity: medium
  description: The EventON WordPress plugin before 4.5.5, EventON WordPress plugin before 2.2.7 do not have authorization in an AJAX action, allowing unauthenticated users to retrieve email addresses of any users on the blog.
  impact: |
    An attacker could potentially access sensitive email information.
  remediation: |
    Update to the latest version of the EventON WordPress Plugin to mitigate CVE-2024-0235.
  reference:
    - https://wpscan.com/vulnerability/e370b99a-f485-42bd-96a3-60432a15a4e9/
    - https://github.com/fkie-cad/nvd-json-data-feeds
  classification:
    cvss-metrics: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N
    cvss-score: 5.3
    cve-id: CVE-2024-0235
    cwe-id: CWE-862
    epss-score: 0.00052
    epss-percentile: 0.19212
    cpe: cpe:2.3:a:myeventon:eventon:*:*:*:*:*:wordpress:*:*
  metadata:
    vendor: myeventon
    product: eventon
    framework: wordpress
    shodan-query: vuln:CVE-2023-2796
    fofa-query: wp-content/plugins/eventon/
  tags: wpscan

http:
  - method: POST
    path:
      - "{{BaseURL}}/wp-admin/admin-ajax.php?action=eventon_get_virtual_users"

    headers:
      Content-Type: application/x-www-form-urlencoded

    body: "_user_role=administrator"
    matchers:
      - type: word
        words:
          - "@"
        part: body
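
Generated output like this benefits from a few cheap checks before a human reviews it. The Python sketch below applies two plain-text heuristics: it flags CVE IDs that differ from the template's own id, and word matchers short enough to match almost any page. These are illustrative checks, not TemplateMan's actual logic, and the four-character threshold is an arbitrary assumption. Applied to the generated template shown above, they point at the shodan-query value, which names CVE-2023-2796 rather than CVE-2024-0235, and at the single-character "@" matcher.

```python
import re

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")
MIN_WORD_LENGTH = 4  # hypothetical threshold for "too short to be specific"


def review_hints(template_text):
    """Return warnings for a generated template, using text heuristics only."""
    hints = []
    id_match = re.search(r"^id:\s*(\S+)", template_text, re.MULTILINE)
    template_id = id_match.group(1) if id_match else None
    if template_id is None:
        hints.append("missing top-level id")

    # Any CVE ID other than the template's own may be a copy-paste leftover.
    if template_id and CVE_PATTERN.fullmatch(template_id):
        for cve in sorted(set(CVE_PATTERN.findall(template_text))):
            if cve != template_id:
                hints.append(f"references {cve}, which differs from id {template_id}")

    # Very short entries under a "words:" key tend to cause false positives.
    in_words = False
    for line in template_text.splitlines():
        stripped = line.strip()
        if stripped == "words:":
            in_words = True
        elif in_words and stripped.startswith("- "):
            word = stripped[2:].strip().strip("\"'")
            if len(word) < MIN_WORD_LENGTH:
                hints.append(f"short word matcher {word!r} may cause false positives")
        elif stripped:
            in_words = False
    return hints
```

A real linter would parse the YAML and validate it against the Nuclei template schema. Line-based checks like these are only meant to surface the most obvious issues early.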

Free users can make up to 10 AI requests per day, with higher access for subscribed plans and unlimited usage for enterprise users. We’ll adjust these thresholds as we analyze usage patterns to better serve all users.

Reviewing and Refining the Template

We manually review the generated template to ensure its accuracy and refine it by adjusting the matchers to minimize false positives. Below is an example of the updated merged template, fully generated automatically using AI.

Using the above process, we have automated the workflow and now receive notifications about newly generated CVE templates directly on our Slack channel.

As a result of this process, we’ve published the nuclei-templates-ai project on GitHub, which hosts a collection of AI-generated templates. We plan to update this repository regularly with newly generated templates. We encourage the community to contribute by reviewing and refining these templates, and by submitting pull requests to the nuclei-templates repository. Through this collaborative effort, we aim to expand CVE coverage and strengthen internet-wide security.

AI Generated Nuclei Templates: https://github.com/projectdiscovery/nuclei-templates-ai

The templates are AI-generated and may contain errors, such as invalid matchers or incorrect request data. We strongly advise against using them without proper review. These templates are specifically intended for the community and our team to review, validate, and contribute to the nuclei-templates project.

We are currently running a program where we send stickers to all first-time contributors. This is a great opportunity to pick an AI-generated template, review it, set up a vulnerable environment, and submit a pull request with the debug data for our team to review on nuclei-templates. In addition to this, we plan to send swag to our top contributors each month.

Challenges We Faced

• Difficulty summarizing POCs from various sources: It was hard to summarize POCs accurately from sources such as random WordPress blogs and Medium posts. These sources often had inconsistent formatting and varying levels of technical detail, which made it difficult for the AI to extract the necessary information.
• Weak matchers: Sometimes the generated templates had weak conditions that could lead to false positives. We addressed this by refining the AI prompts and adding manual review steps.
• Inconsistent data: CVE data occasionally had inaccuracies, such as misclassified vulnerabilities. We improved our filtering criteria and included manual verification.
• Limited scope for authenticated CVEs: This approach works well for unauthenticated CVEs, but authenticated CVEs need additional context and access that the current pipeline does not provide. Handling them will require further enhancements.

Future Improvements

We're excited about the potential to enhance this system further. Our planned improvements include:

• Training Custom AI Models: Building AI models specifically tailored for security tasks could improve accuracy and handle complex cases more effectively.
• Automated Testing: Integrating automated testing of generated templates to streamline validation and ensure they work as intended before deployment.
• Support for Images and Attachments in our API: Enhancing the AI API to process vulnerability information directly from images and PDF reports. Instead of relying solely on text inputs, users can upload screenshots or PDF documents containing vulnerability details.
• Community Collaboration: Opening the project to community contributions to gather diverse insights and expertise, further refining our approach.

Using ProjectDiscovery for Template Creation

To make automated template generation accessible, we’ve seamlessly integrated it into our cloud platform. Here’s how you can get started:

• Sign Up: Visit ProjectDiscovery and create a free account.
• Access the Template Editor: Log in and navigate to the Template Editor.
• Start with AI: Click on the Start with AI button.
• Input Vulnerability Details: Enter key details about the vulnerability, such as the CVE ID and relevant exploit information.
• Generate Template: Click Generate, and let the AI create a tailored Nuclei template for you.
• Review and Save: Carefully review the generated template, make any necessary adjustments, and save it to your account.

For detailed guidance, refer to our documentation.

Conclusion

By leveraging AI, we’ve significantly sped up the process of generating Nuclei templates for new CVEs. This approach helps us stay ahead in identifying vulnerabilities and provides the community with timely detection capabilities.

Despite the initial challenges, our iterative improvements have significantly enhanced the quality and consistency of our templates. Integrating this process into our cloud platform has made it accessible and easy to use for everyone in the community. Your feedback and contributions are invaluable in helping us improve and make cybersecurity more accessible.

Feel free to explore, contribute, and share your thoughts!

You can also join our Discord server. It's a great place to connect with fellow contributors and stay updated with the latest developments. Thank you, once again!

By leveraging Nuclei and actively engaging with the open-source community, or by becoming a part of the ProjectDiscovery Cloud Platform, companies can enhance their security measures, proactively address emerging threats, and establish a more secure digital landscape. Security represents a shared endeavor, and by collaborating, we can consistently adapt and confront the ever-evolving challenges posed by cyber threats.

Interested in ProjectDiscovery? Learn more here...
