During my tenure as a Salesforce consultant, I often find myself tasked with reviewing file storage within clients’ Salesforce instances.
This arises because many clients rely on Salesforce as their primary channel for communicating with customers, whether through sales or support.
Consequently, numerous attachments and files are saved on emails exchanged between users, creating file links behind the scenes that tie together users, emails, and even the originating objects. This rapid accumulation can swiftly push your file storage beyond the 10 GB base allocation that Salesforce includes as standard (editions typically add a further allowance per user license). It’s essential to note that file storage is distinct from data storage.
Today, I’d like to shed some light on how files are stored in the background of Salesforce and offer insights on how to query your Salesforce instance to pinpoint where the bulk of your files are being stored. This builds upon my previous tips, where I discussed leveraging SOQL for analyzing your Salesforce instance and swiftly querying data that Salesforce reports may not readily provide access to. Let’s dive in and explore these methods together.
Files in Salesforce: An Overview
In Salesforce, a file is referred to as a Content Document (or ContentDocument using the API Name). Every file in Salesforce is represented by a ContentDocument, but additional records are created alongside it: ContentVersion and ContentDocumentLink are the two key entities that I frequently query.
ContentVersion holds the file content itself, one record per version of the file. For instance, if you attach “file1.txt” and later modify and re-save it, a new ContentVersion with the same ContentDocumentId is generated, and the ContentDocument’s LatestPublishedVersionId is updated to point at it.
ContentDocumentLink, on the other hand, manages the visibility of your file (ContentDocument) by linking it to whoever or whatever it is shared with. This can be intricate to explain. For instance, when you upload a file onto an Object record, two Content Document Links are established: one for the record and another for you (the User). In each case the Id of the record or user is populated in the LinkedEntityId field of the Content Document Link.
To better illustrate how these components seamlessly interact, I find the Entity Relationship Diagram (ERD) below particularly illuminating:
Image Credit: https://driveconnect.me/blog/contentdocument-contentversion-and-contentdocumentlink/
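For readers who prefer code to diagrams, here is a rough in-memory sketch of the same relationships in Python. It is only a toy model of the behaviour described above, not Salesforce code: the Org class, its method names and the fake Ids are all made up for illustration.

```python
from dataclasses import dataclass
from itertools import count

_counter = count(1)

def _fake_id(label: str) -> str:
    # Hypothetical Ids for illustration; real Salesforce Ids look different.
    return f"{label}-{next(_counter)}"

@dataclass
class ContentDocument:
    id: str
    latest_published_version_id: str = ""

@dataclass
class ContentVersion:
    id: str
    content_document_id: str
    title: str

@dataclass
class ContentDocumentLink:
    id: str
    content_document_id: str
    linked_entity_id: str

class Org:
    """Toy stand-in for an org's file tables."""

    def __init__(self):
        self.documents = {}
        self.versions = []
        self.links = []

    def upload(self, title, record_id, user_id):
        # One document, one first version, and two links (record + user).
        doc = ContentDocument(id=_fake_id("doc"))
        self.documents[doc.id] = doc
        self.add_version(doc.id, title)
        for entity_id in (record_id, user_id):
            self.links.append(
                ContentDocumentLink(_fake_id("link"), doc.id, entity_id)
            )
        return doc

    def add_version(self, document_id, title):
        # A re-saved file becomes a new version of the same document.
        version = ContentVersion(_fake_id("ver"), document_id, title)
        self.versions.append(version)
        self.documents[document_id].latest_published_version_id = version.id
        return version
```

Uploading "file1.txt" against an Opportunity and then calling add_version for the same document leaves one document, two versions and two links, with latest_published_version_id pointing at the newer version.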
Why Understand the Data Model?
It is important to understand how the data model is set up so that you can query these records and assess where your file storage is being consumed.
Overcoming Query Challenges
I have had many battles trying to get my head around querying all the ContentDocumentLink records for an object. For example, my first attempt was to filter on the Opportunity key prefix (006):
SELECT Id
FROM ContentDocumentLink
WHERE LinkedEntityId LIKE '006%'
If you run this query, you will receive a lovely error message:
MALFORMED_QUERY: Implementation restriction: ContentDocumentLink requires a filter by a single Id on ContentDocumentId or LinkedEntityId using the equals operator or multiple Id’s using the IN operator.
I then discovered that this worked:
SELECT Id, LinkedEntityId, ContentDocument.ContentSize
FROM ContentDocumentLink
WHERE LinkedEntityId IN (SELECT Id FROM Opportunity)
This gave me a way to quickly work out how much space each object’s files were taking up, by summing ContentDocument.ContentSize (reported in bytes) across the returned rows. That is handy if you are considering a third-party file exporter.
Considerations like this may matter for pricing. If you find you only need to map one object instead of all objects within Salesforce, you may be able to negotiate a better rate.
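Once the rows are exported, totalling them up is straightforward. The Python sketch below is one way to do it, under a few assumptions of mine: the rows are dictionaries shaped like REST API query results, the query also selects ContentDocumentId (the query above would need that field added), and the function name storage_by_prefix is hypothetical. It groups by the first three characters of LinkedEntityId, which identify the object (006 for Opportunity).

```python
from collections import defaultdict

def storage_by_prefix(rows):
    """Sum file sizes in bytes per object key prefix.

    A document linked to several records of the same object is
    counted once for that object.
    """
    seen = set()
    totals = defaultdict(int)
    for row in rows:
        prefix = row["LinkedEntityId"][:3]
        key = (prefix, row["ContentDocumentId"])
        if key in seen:
            continue  # same file already counted for this object
        seen.add(key)
        totals[prefix] += row["ContentDocument"]["ContentSize"]
    return dict(totals)

# Made-up sample rows; the Ids are placeholders, not real records.
sample_rows = [
    {"LinkedEntityId": "006AAA", "ContentDocumentId": "069X1",
     "ContentDocument": {"ContentSize": 2000}},
    {"LinkedEntityId": "006BBB", "ContentDocumentId": "069X1",
     "ContentDocument": {"ContentSize": 2000}},
    {"LinkedEntityId": "006AAA", "ContentDocumentId": "069X2",
     "ContentDocument": {"ContentSize": 500}},
    {"LinkedEntityId": "001CCC", "ContentDocumentId": "069X2",
     "ContentDocument": {"ContentSize": 500}},
]

totals = storage_by_prefix(sample_rows)
```

Bear in mind that a file linked to records of two different objects appears under both prefixes, so the per-object totals can add up to more than the storage actually used.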
Why Consider Implications of Deletion?
Another point to consider: if you wish to delete or move these files yourself, what are the implications for the three objects we have discussed?
Most importantly, if you delete the Content Document (File), you also delete its Content Document Links and Content Versions. If you delete a Content Document Link, the record it was associated with loses visibility of the document. And you cannot delete an active Content Version.
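To make the difference between the two kinds of delete concrete, here is a small Python sketch that applies them to plain lists of dictionaries. It only mimics the behaviour described above on in-memory data; the function names and sample Ids are hypothetical.

```python
def delete_document(document_id, versions, links):
    """Deleting the document takes its versions and links with it."""
    kept_versions = [
        v for v in versions if v["ContentDocumentId"] != document_id
    ]
    kept_links = [
        l for l in links if l["ContentDocumentId"] != document_id
    ]
    return kept_versions, kept_links

def delete_link(link_id, links):
    """Deleting one link only removes that record's view of the file."""
    return [l for l in links if l["Id"] != link_id]

# Placeholder data: one file with two versions, linked to a record and a user.
versions = [
    {"Id": "v1", "ContentDocumentId": "d1"},
    {"Id": "v2", "ContentDocumentId": "d1"},
]
links = [
    {"Id": "l1", "ContentDocumentId": "d1", "LinkedEntityId": "opp-1"},
    {"Id": "l2", "ContentDocumentId": "d1", "LinkedEntityId": "user-1"},
]

links_after_unlink = delete_link("l1", links)
versions_after_delete, links_after_delete = delete_document("d1", versions, links)
```

After delete_link the file and both versions still exist and only the Opportunity has lost sight of it, whereas delete_document leaves nothing behind.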
If you want to create Content Documents through a mass insert, you do this through the Content Version. Counterintuitively, inserting a Content Version produces the Content Document. You then also have to create ContentDocumentLinks for any specific records you wish to link these files to. I have found this very common with records such as notes.
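As a rough illustration of that two-step load, the Python sketch below builds the field values for each record. It only prepares dictionaries and does not call Salesforce. The helper names are hypothetical, the Ids are placeholders, and the choice of ShareType "V" (viewer access) is an assumption you may want to change.

```python
import base64

def content_version_payload(filename: str, data: bytes) -> dict:
    """Fields for a new ContentVersion; inserting it creates the file."""
    return {
        "Title": filename,
        "PathOnClient": filename,
        # File bodies are sent base64-encoded.
        "VersionData": base64.b64encode(data).decode("ascii"),
    }

def content_document_link_payload(content_document_id: str,
                                  linked_entity_id: str) -> dict:
    """Fields for a ContentDocumentLink tying a file to a record."""
    return {
        "ContentDocumentId": content_document_id,
        "LinkedEntityId": linked_entity_id,
        "ShareType": "V",  # assumption: viewer access is enough
    }

# Step 1: insert this with whichever loader or API client you use.
version = content_version_payload("file1.txt", b"hello")

# Step 2: once inserted, read ContentDocumentId back from the new
# ContentVersion, then build the link. Both Ids here are placeholders.
link = content_document_link_payload("069PLACEHOLDER", "006PLACEHOLDER")
```

The step in the middle is the one people tend to miss: the ContentDocumentId does not exist until the ContentVersion has been inserted, so it has to be queried back before the links can be created.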
There is a lot to discover in the “Content” world and I encourage everyone to have a try. If you have documents you need to create, move or delete in bulk, it will be beneficial to understand this particular data model.
I hope this short overview gives you the tools to get started and, as always,
May the Salesforce be with you!