Facebook Page Follower Scrap (view page source)
Overview
This script is an updated version of Facebook Page Follower Scrap. It retrieves additional profile information by parsing the page source of each profile. It automates the scraping of the following Facebook profile information:
Profile names
Facebook IDs
Profile links
Email addresses
Phone numbers
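Profiles are collected from the Facebook mobile app via Appium. Each profile's page source is then loaded in headless Chrome through Selenium, and the results are stored in a Google Sheet.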
Main Difference
The updated version differs from the original in that it also extracts email addresses and phone numbers from the page source. In addition, the Facebook ID is now obtained from the page source as well.
Below is the updated code:
import re
import time
from urllib.parse import urlparse

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# EMAIL_REGEX and PHONE_REGEX are defined in the Configuration section


def get_facebook_details(original_url):
    driver = None
    try:
        # Setup Chrome options
        chrome_options = Options()
        chrome_options.add_argument("--disable-gpu")
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--disable-dev-shm-usage")
        chrome_options.add_argument("--headless=new")
        driver = webdriver.Chrome(options=chrome_options)

        # Visit the original URL and wait for redirects to settle
        driver.get(original_url)
        time.sleep(5)
        final_url = driver.current_url

        # Clean the profile link by dropping the query string and fragment
        parsed = urlparse(final_url)
        clean_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"

        # Get page source for detailed scraping
        page_source = driver.page_source

        # Extract Facebook user ID
        user_id = None
        id_match = re.search(r'"userID":"(\d+)"', page_source)
        if id_match and id_match.group(1) != '0':
            user_id = id_match.group(1)
            print(f"✅ Extracted ID: {user_id}")

        # Extract email addresses, skipping Facebook's own domains
        emails = set()
        for match in re.finditer(EMAIL_REGEX, page_source):
            email = match.group().lower()
            if not email.endswith('facebook.com') and not email.endswith('fb.com'):
                emails.add(email)

        # Extract phone numbers
        phones = set()
        for match in re.finditer(PHONE_REGEX, page_source):
            phone = match.group(1).strip()
            # Clean and standardize the format
            phone = re.sub(r'[^\d\+]', '', phone)  # Remove all non-digit/non-plus characters
            phone = phone.lstrip('0') if phone.startswith('0') and len(phone) > 10 else phone
            # Validate it's a proper phone number (at least 8 digits)
            if sum(c.isdigit() for c in phone) >= 8:
                phones.add(phone)

        return {
            "url": clean_url,
            "id": user_id,
            "emails": list(emails)[:3] if emails else None,
            "phones": list(phones)[:3] if phones else None
        }
    except Exception as e:
        print("❌ Error in get_facebook_details:", e)
        return None
    finally:
        # Always close the browser, even if scraping failed
        if driver is not None:
            driver.quit()
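The URL cleanup and the ID lookup in get_facebook_details do not need a browser, so they can be tried in isolation. The following is a minimal sketch, not part of the original script: the helper names clean_profile_url and extract_user_id and the sample strings are hypothetical, while the urlparse logic and the "userID" regex mirror the code above.

```python
import re
from urllib.parse import urlparse


def clean_profile_url(final_url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parsed = urlparse(final_url)
    return f"{parsed.scheme}://{parsed.netloc}{parsed.path}"


def extract_user_id(page_source):
    """Return the first "userID" value in the page source, or None if it is missing or "0"."""
    id_match = re.search(r'"userID":"(\d+)"', page_source)
    if id_match and id_match.group(1) != "0":
        return id_match.group(1)
    return None


# Hypothetical inputs, for illustration only
sample_url = "https://www.facebook.com/examplepage/?ref=share&locale=en_US#top"
sample_source = '{"foo":1,"userID":"100012345678901","bar":2}'

print(clean_profile_url(sample_url))      # https://www.facebook.com/examplepage/
print(extract_user_id(sample_source))     # 100012345678901
print(extract_user_id('{"userID":"0"}'))  # None
```

Keeping these steps as small pure functions makes it possible to check the parsing against a saved page source without launching Chrome.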
Configuration
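Regular expressions used to match email addresses and phone numbers. Note that PHONE_REGEX only matches numbers that appear in the page source as a "text":"..." value, and the number itself is captured in group 1: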
EMAIL_REGEX = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
PHONE_REGEX = r'"text":"((?:\+\d{1,3}[-.\s]?)?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9})"'
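To see what these two patterns accept, they can be run against a small string. The sketch below is an illustration under assumptions: the page-source fragment is made up, the helper names extract_emails and extract_phones are hypothetical, and the results are sorted only to make the output deterministic (the script itself keeps up to three items from an unordered set). The filtering and normalization steps follow get_facebook_details.

```python
import re

EMAIL_REGEX = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
PHONE_REGEX = r'"text":"((?:\+\d{1,3}[-.\s]?)?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9})"'

# Hypothetical page-source fragment, for illustration only
sample_source = (
    '{"text":"+60 12-345 6789"},'
    '{"text":"012-345 67890"},'
    '{"text":"Contact: Owner@Example.com"},'
    '{"text":"support@facebook.com"},'
    '{"text":"12345"}'
)


def extract_emails(page_source):
    """Collect lowercase emails, skipping Facebook's own domains."""
    emails = set()
    for match in re.finditer(EMAIL_REGEX, page_source):
        email = match.group().lower()
        if not email.endswith("facebook.com") and not email.endswith("fb.com"):
            emails.add(email)
    return sorted(emails)


def extract_phones(page_source):
    """Collect normalized phone numbers that contain at least 8 digits."""
    phones = set()
    for match in re.finditer(PHONE_REGEX, page_source):
        # Keep only digits and the plus sign
        phone = re.sub(r"[^\d+]", "", match.group(1))
        # Strip leading zeros from long numbers, as the script does
        if phone.startswith("0") and len(phone) > 10:
            phone = phone.lstrip("0")
        if sum(c.isdigit() for c in phone) >= 8:
            phones.add(phone)
    return sorted(phones)


print(extract_emails(sample_source))  # ['owner@example.com']
print(extract_phones(sample_source))  # ['+60123456789', '1234567890']
```

The "12345" entry matches PHONE_REGEX but is dropped by the 8-digit check, and the facebook.com address is removed by the domain filter.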
Execution:
To run this script (the filename is quoted because it contains a space): python "facebook_scrapper updated.py"