My data-driven blog series has mysteriously gone quiet. We’ve had a really eventful year so far, and it’s been thrilling to unveil so much groundbreaking work at “Now You Know 2017” (our user conference in Denver). A few people have asked me where I’ve been, so now that the dust has settled, I’ll be discussing the fascinating projects our engineers are working on.
Because we are placing an emphasis on openness, there is a chance that I may miss something or get a particular detail wrong. Think of what you read here as my thoughts at the moment rather than any kind of formal marketing.
1. First, do you have access to the Facebook Firehose?
So, I lied; this isn’t exactly a frequently asked question. It comes up often for Instagram, but I think most professionals recognise how hard it is to obtain Facebook data. I decided to pose the question to myself anyway, since the answer provides important background for anyone unfamiliar with the subject.
Twitter earns revenue not only from advertising but also from data sales. Twitter conversations are, for the most part, open to the public, and that openness is a major part of what makes the platform work, so there are no major ethical problems with this. It also makes Twitter data especially simple for us to collect: we can use the Firehose, which gives us access to that public data at will. Facebook, by contrast, offers no such Firehose.
2. It may be challenging, but… how do we get Facebook data at all?
It’s not that bad, though, because we still have access to a substantial amount of information. Our current Facebook data coverage options include:
Owned Facebook pages: If you own a Facebook page, we can access its activity feed, including posts and comments, along with analytics such as the number of likes. Campaign and performance metrics, community management, and insights are some of the most common uses.
Non-owned Facebook pages: Facebook pages that you don’t personally manage can still be monitored for activity and for analytics such as page likes. The most common uses include tracking and comparing performance against industry benchmarks, learning about the competition, and interacting with fans on social media.
Incidental Facebook coverage: Creating Facebook channels also adds that data to your usual Brandwatch searches, which is a hidden but welcome perk. Some people use this deliberately as a way of broadening their Facebook coverage.
3. What are the limits to collecting information from Facebook, and why are they there?
To reiterate, Facebook does not provide a Firehose of all of its data for us to use. There is no search API either, so we cannot simply ask Facebook to deliver all data on a certain topic. Instead, we have to crawl the public APIs to retrieve posts and comments for specific Facebook pages.
Two primary resources must be managed for this to work, however. The first may seem simple, but it is significant given the sheer number of pages our clients expect us to cover: computing power. To keep up with customer demand, we need enough servers to crawl all new content. We have some say in this matter: we can simply add more computational resources, or purchase additional capacity, as needed.
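To make the computing-power point a little more concrete, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name `servers_needed`, the idea of measuring load in page crawls per hour, and the numbers themselves are made up and are not our real figures or code.

```python
import math


def servers_needed(pages, crawls_per_page_per_hour, crawls_per_server_per_hour):
    """Estimate how many crawl servers are needed to keep up with demand.

    All three inputs are hypothetical planning figures, not real measurements.
    """
    if crawls_per_server_per_hour <= 0:
        raise ValueError("per-server throughput must be positive")
    total_crawls_per_hour = pages * crawls_per_page_per_hour
    # Round up: a fraction of a server still means one more machine.
    return math.ceil(total_crawls_per_hour / crawls_per_server_per_hour)


# Made-up example: 50,000 pages, each checked 4 times an hour,
# with one server handling 10,000 page crawls an hour.
print(servers_needed(50_000, 4, 10_000))  # prints 20
```

The useful part is the shape of the calculation rather than the numbers: demand grows with the number of pages clients add, and capacity is the one side of the equation we can increase ourselves.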
4. Do you have any suggestions on how I might increase my Facebook coverage?
We don’t have many restrictions on creating channels at the moment; you may add as many as you like. But, as you’ll see in the rest of this post, authenticating with multiple Facebook accounts can greatly improve your crawls, especially if those accounts also have admin access to the pages you intend to crawl.
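One way to picture why extra authenticated accounts help is to imagine the crawl work being shared out between them. The sketch below is only that, a picture: it assumes each account carries its own request budget, and the names `assign_pages_to_accounts` and `admin_of`, along with the sample accounts and pages, are hypothetical. It does not describe how our crawlers actually schedule work.

```python
from itertools import cycle


def assign_pages_to_accounts(page_ids, account_ids, admin_of=None):
    """Share pages out between authenticated accounts.

    A page goes to an account that administers it when there is one;
    otherwise pages are dealt out round-robin. `admin_of` maps an
    account id to the set of page ids it administers (hypothetical).
    """
    if not account_ids:
        raise ValueError("at least one authenticated account is required")
    admin_of = admin_of or {}
    assignment = {account_id: [] for account_id in account_ids}
    fallback = cycle(account_ids)
    for page_id in page_ids:
        admins = [a for a in account_ids if page_id in admin_of.get(a, ())]
        account_id = admins[0] if admins else next(fallback)
        assignment[account_id].append(page_id)
    return assignment


# Made-up example: two accounts, one of which administers "mybrand".
plan = assign_pages_to_accounts(
    ["theverge", "engadget", "androidcentral", "mybrand"],
    ["alice", "bob"],
    admin_of={"bob": {"mybrand"}},
)
print(plan)
```

With a single account, every page would land on the same list; with two, the load is split, and the page that has an admin account is crawled through it.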
5. Can you elaborate on this incidental Facebook coverage issue that I keep hearing about…?
While it’s true that we can’t perform completely arbitrary searches throughout Facebook using an API, there is a workaround.
Once added to the archive, channel data may be accessed using the same Brandwatch Analytics queries as the rest of our social data.
This allows you to build channels in the app to collect content from industry-specific Facebook pages, which can then be text-matched against a more refined query.
If you’re in the mobile phone industry, for instance, you might wish to monitor the web for mentions of your own or rival brands. While we can’t directly query Facebook, we can set up channels for the Facebook pages of prominent tech and phone sites (such as The Verge, Engadget, and Android Central). Any posts or comments from those channels that contain words matching your query will be pulled in and presented in your dashboards alongside your Twitter, News, Web, Instagram, and Reddit data.
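The text-matching step can be sketched in a few lines of Python. This is a deliberately simplified illustration: it treats a query as a plain list of words or phrases and matches them case-insensitively, whereas real Brandwatch queries are richer than that. The function names and the sample posts are made up for the example.

```python
import re


def matches_query(text, terms):
    """Return True if any term appears in the text as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE)
        for term in terms
    )


def filter_channel_items(items, terms):
    """Keep only the channel posts or comments whose text matches the query."""
    return [item for item in items if matches_query(item["text"], terms)]


# Made-up posts standing in for content already collected from channels.
items = [
    {"page": "The Verge", "text": "Hands-on with the new Pixel phone"},
    {"page": "Engadget", "text": "Best laptops for students"},
    {"page": "Android Central", "text": "Is the Galaxy S8 still worth buying?"},
]
terms = ["pixel", "galaxy s8"]

for item in filter_channel_items(items, terms):
    print(item["page"], "->", item["text"])
```

Only the first and third posts get through. The key point is that the matching runs over content the channels have already collected, which is why the choice of pages determines what you can find.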