Expel AME reduces the time analysts spend triaging benign emails by identifying and auto-closing emails related to marketing activities. This gives analysts more time to focus on triaging suspicious emails.
Marketing emails were specifically targeted for auto-identification because:
- They are largely recognizable to a human, so a machine can be taught what to look for.
- Expel receives a large volume of marketing emails; approximately 30% of our overall submissions are marketing emails.
What's Considered a Marketing Email?
Expel considers a variety of emails to be marketing email. Some examples include:
What Does Expel AME Output?
Expel AME returns 4 categories:
- Marketing
- Likely marketing
- Inconclusive
- Not marketing
Expel is conservative about what it confidently labels as a marketing email: the model must assign the email a marketing probability above 95%.
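
The mapping from a model probability to the four output categories can be sketched as follows. This is a minimal illustration, not Expel's implementation: only the 95% cutoff comes from the text above, while the `categorize` function name and the 0.70 and 0.30 boundaries are hypothetical values chosen for the example. Treating exactly 95% as "Marketing" is also an assumption.

```python
def categorize(p_marketing: float) -> str:
    """Map a marketing probability to one of the four Expel AME categories."""
    if not 0.0 <= p_marketing <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # Only the 0.95 cutoff is taken from the article.
    if p_marketing >= 0.95:
        return "Marketing"
    # The two boundaries below are hypothetical, for illustration only.
    if p_marketing >= 0.70:
        return "Likely marketing"
    if p_marketing >= 0.30:
        return "Inconclusive"
    return "Not marketing"
```

Keeping the top cutoff high means an email is labeled "Marketing" only when the model is very sure, at the cost of sending more borderline emails to analysts.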
What If I See an Email with ≥ 95% Marketing Probability in My Queue? Should It Be Auto-Closed?
In short, yes. However, because of the configurations we have in place and our quality control (QC) process, you may occasionally see an email in your queue that would normally have been auto-closed.
The following scenarios can keep an email from being auto-closed as marketing, even when its marketing probability is high:
- The email came from a newly onboarded customer. For these customers, Expel may delay turning on Expel AME until we're certain how it performs for that specific customer.
- The email didn't pass the Expel post-processing rules. Post-processing rules are additional logic Expel applies before closing an email as marketing. These rules ensure that emails that may be masquerading as marketing emails are reviewed by analysts and not auto-closed. Some example post-processing rules are:
  - The marketing email was sent from a young domain.
  - The email came from a sender associated with previous malicious campaigns.
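
The scenarios above amount to a set of gates that an email must pass before it is auto-closed. The sketch below shows one way to express them. The `EmailContext` fields, the `should_auto_close` function, and the 30-day definition of a "young" domain are assumptions made for illustration; the article does not specify how Expel implements these checks.

```python
from dataclasses import dataclass
from typing import Tuple

MARKETING_THRESHOLD = 0.95   # from the article: at least 95% marketing probability
MIN_DOMAIN_AGE_DAYS = 30     # hypothetical cutoff for a "young" domain


@dataclass
class EmailContext:
    p_marketing: float                       # model's marketing probability
    ame_enabled_for_customer: bool           # False while a new customer is evaluated
    sender_domain_age_days: int
    sender_seen_in_malicious_campaign: bool


def should_auto_close(ctx: EmailContext) -> Tuple[bool, str]:
    """Return (auto_close, reason); any failed check routes the email to an analyst."""
    if ctx.p_marketing < MARKETING_THRESHOLD:
        return False, "below marketing threshold"
    if not ctx.ame_enabled_for_customer:
        return False, "AME not yet enabled for this customer"
    if ctx.sender_domain_age_days < MIN_DOMAIN_AGE_DAYS:
        return False, "young sender domain"
    if ctx.sender_seen_in_malicious_campaign:
        return False, "sender linked to previous malicious campaign"
    return True, "high-probability marketing, all checks passed"
```

Returning a reason alongside the decision makes it easier to explain why a high-probability email still landed in the analyst queue.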
What Factors Are Considered When Deciding an Email Is Marketing?
The following is a mostly complete list of the items the model uses to predict whether an email is marketing. The list was developed with our subject matter experts, our existing data sets, and the thousands of incidents we have addressed.
We continually update and tune this list.
| Item | Description |
|---|---|
| Does the return path match the sender? | If the return path doesn't match the sender, the email is suspicious. |
| Is the sender domain associated with a previous malicious investigation? | If the sender domain is associated with a previous malicious investigation, this email is suspicious. |
| Is the subject tagged as marketing by upstream systems? | If an upstream system tagged this email as marketing, we incorporate this additional external knowledge. |
| Is the sender a corporate marketing account? | Professional marketing emails likely come from a marketing group within an organization. |
| Does the email contain an unsubscribe button? | Some marketing emails include an unsubscribe button. |
| Does the email have attachments? | Marketing emails are less likely to have attachments, instead opting for hyperlinks. |
| Does the email contain attachments that are common attack surfaces? | If the email contains attachments that are common attack surfaces, like spreadsheets or compressed files, it's more suspicious and less likely to be marketing. |
| Is the email sent from a personal email account? | Marketing emails are not likely to come from personal email accounts. |
| What Expel severity did the YARA rules decide on? | We include escalation logic derived from analyst experience, where High and Medium severity suggest that the email is suspicious. |
| Is the email sent from a corporation known to be abused by attackers? | PayPal, SharePoint, and DocuSign are platforms commonly abused by attackers, so emails from these organizations can require additional oversight. |
| How many domains are linked to in the email? | A marketing email can reference more unique domains because these emails reference external content. |
| How many URLs are linked to in the email? | A marketing email can reference more unique URLs because these emails reference external content and often contain unique identifiers. |
| How many marketing-related terms are in the email body? | Marketing phrases and products (such as webinars and white papers) are clearly advertised in the email body. |
| How many marketing-related terms are in the subject? | Marketing phrases and products (such as webinars and white papers) are clearly advertised in the subject line. |
| How colorful is the email screenshot? | Some types of marketing emails can be very colorful. |
| How long is the email? | Shorter emails are more likely to be marketing. |
| How many explicit terms are used throughout the email body? | Marketing emails tend to not include explicit terminology. |
| What is the sender domain age? | Newly registered domains tend to be suspicious. |
| Can we represent the entire email body with text embeddings? | If we capture semantic relations within an email body, we can differentiate email categories based on how the email was written. |
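
A few of the items in the table can be derived directly from the raw message. The sketch below computes a small subset of them using only the Python standard library. It is illustrative rather than a description of Expel's feature pipeline: the `extract_features` function, the `MARKETING_TERMS` list, the URL pattern, and the choice to compare full addresses for the return-path check are all assumptions.

```python
import re
from email.utils import parseaddr
from urllib.parse import urlparse

# Hypothetical term list; the real vocabulary is not given in the article.
MARKETING_TERMS = ("webinar", "white paper", "register now")
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def count_terms(text: str) -> int:
    """Count occurrences of the marketing terms, ignoring case."""
    lowered = text.lower()
    return sum(lowered.count(term) for term in MARKETING_TERMS)


def extract_features(sender: str, return_path: str, subject: str, body: str) -> dict:
    """Compute a handful of the table's items from header and body text."""
    urls = set(URL_PATTERN.findall(body))
    hosts = {urlparse(url).hostname for url in urls}
    hosts.discard(None)  # drop URLs with no parseable host
    sender_addr = parseaddr(sender)[1].lower()
    return_addr = parseaddr(return_path)[1].lower()
    return {
        "return_path_matches_sender": sender_addr == return_addr,
        "unique_url_count": len(urls),
        "unique_domain_count": len(hosts),
        "subject_marketing_terms": count_terms(subject),
        "body_marketing_terms": count_terms(body),
        "has_unsubscribe": "unsubscribe" in body.lower(),
    }
```

Items such as screenshot colorfulness, YARA severity, domain age, and text embeddings need external data or models and are left out of this sketch.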
What Quality Checks Is Expel Doing?
Expel uses multiple quality control (QC) processes to ensure Expel AME does what it’s supposed to do: auto-close only benign marketing emails.
Expel randomly turns off auto-close for a sample of highly probable marketing emails, and redirects these emails to be triaged by senior analysts. This is done to compare the decision made by a senior analyst to how the model triaged the email. We use these comparisons to monitor our model and the results it’s producing.
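
The random hold-out described above can be sketched as a routing step plus an agreement metric. This is a simplified illustration: the 5% `QC_SAMPLE_RATE`, the route names, and the `route` and `agreement_rate` functions are hypothetical, since the article does not state the sampling rate or how the comparison is computed.

```python
import random

MARKETING_THRESHOLD = 0.95  # from the article
QC_SAMPLE_RATE = 0.05       # hypothetical share of auto-closable emails held for QC


def route(p_marketing: float, rng: random.Random) -> str:
    """Decide where an email goes; a random sample skips auto-close for QC."""
    if p_marketing < MARKETING_THRESHOLD:
        return "analyst_queue"
    if rng.random() < QC_SAMPLE_RATE:
        return "senior_analyst_qc"
    return "auto_close"


def agreement_rate(pairs: list) -> float:
    """Fraction of (model_label, analyst_label) pairs where the labels match."""
    if not pairs:
        raise ValueError("need at least one QC comparison")
    return sum(model == analyst for model, analyst in pairs) / len(pairs)
```

Passing the random generator in as an argument lets the sampling be seeded and reproduced when auditing QC results. A falling agreement rate over time would be a signal to review the model.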
Expel uses Arthur AI to watch the model in production. With Arthur AI, we look at the inputs and outputs of the model to see if there’s a drift in the data. Data drift is a leading indicator of potential performance degradation of the model.
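
To make the idea of data drift concrete, the sketch below computes the population stability index (PSI), a common generic drift measure, between a baseline sample of a feature and a recent sample. This is not Arthur AI's API and the article does not say which drift metric is used; the `psi` function, the bin count, and the smoothing constant are assumptions made for illustration.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index between a baseline and a recent sample."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline sample has no spread")
    width = (hi - lo) / bins

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for value in values:
            # Values outside the baseline range fall into the edge bins.
            index = min(max(int((value - lo) / width), 0), bins - 1)
            counts[index] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(count / len(values), eps) for count in counts]

    base, recent = fractions(expected), fractions(actual)
    return sum((r - b) * math.log(r / b) for b, r in zip(base, recent))
```

Identical distributions give a PSI of zero, and larger values indicate a bigger shift between the baseline and recent inputs. What threshold should trigger a review is a judgment call that depends on the feature and the model.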