SEO Best Practices Guide

This guide helps you optimize your al-folio website for search engines so your research and work are discoverable.
SEO (Search Engine Optimization) makes your website discoverable on Google, Bing, and other search engines. For academics, this means:
- Your research becomes discoverable when people search for your work
- Your CV/bio appears in search results
- Your publications rank higher
- More citations and collaborations
al-folio includes SEO basics, but you can optimize further.
al-folio auto-generates a sitemap.xml and robots.txt for you. These tell search engines what pages exist.
Verify they exist:
- Visit https://your-site.com/sitemap.xml – should show an XML list of pages
- Visit https://your-site.com/robots.txt – should show instructions for search engines
If they're missing:
- Check that _config.yml has a valid url
- Rebuild: bundle exec jekyll build
- Check that the _site/ directory has both files
No configuration needed – al-folio handles this automatically.
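The manual check above can also be scripted. The following is a minimal sketch, assuming you have the contents of the generated sitemap.xml and robots.txt as strings (for example, read from the _site/ directory). The function names and the sample file contents are hypothetical and not part of al-folio.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Report whether robots.txt lets the given user agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical file contents, used only for illustration.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://your-site.com/</loc></url>
  <url><loc>https://your-site.com/publications/</loc></url>
</urlset>"""
EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    for url in sitemap_urls(EXAMPLE_SITEMAP):
        print(url, "crawlable:", is_crawlable(EXAMPLE_ROBOTS, url))
```

If a page you expect to be indexed is missing from the sitemap or blocked by robots.txt, that is worth fixing before looking at anything else.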
Ensure _config.yml has correct metadata:
title: Your Full Name or Site Title
description: > # Brief description (1-2 sentences)
  A description of your research and expertise.
  This appears in search results.
author: Your Name
keywords: machine learning, research, academia, etc.
url: https://your-domain.com
lang: en

All fields are important for SEO. Avoid leaving fields blank.
When someone shares your page on Twitter, Facebook, LinkedIn, etc., Open Graph controls what preview appears.
Without Open Graph:
- Generic title
- No image
- Ugly preview
With Open Graph:
- Your custom title
- Your custom image (photo, diagram, etc.)
- Custom description
- Professional preview
Open Graph is disabled by default. To enable:
- Edit _config.yml:
  serve_og_meta: true # Change from false to true
  og_image: /assets/img/og-image.png # Path to your image (1200x630px recommended)
- Create your OG image:
  - Size: 1200x630 pixels
  - Format: PNG or JPG
  - Content: Your name/logo + key info
  - Save to: assets/img/og-image.png
- Commit and deploy
- Test it:
  - Use Facebook's Sharing Debugger
  - Paste your site URL
  - You should see your custom image and title
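You can also check the built HTML locally before deploying. The sketch below collects Open Graph meta tags from an HTML string and reports which expected ones are missing. The set of required tags (og:title, og:description, og:image) is an assumption about what a useful preview needs, not a list taken from al-folio, and the sample HTML is hypothetical.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content") or ""
        if prop.startswith("og:") and content:
            self.tags[prop] = content


def open_graph_tags(html: str) -> dict[str, str]:
    """Return a mapping of Open Graph property names to their content."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.tags


def missing_og_tags(html: str, required=("og:title", "og:description", "og:image")) -> list[str]:
    """List the required Open Graph properties that the page does not define."""
    found = open_graph_tags(html)
    return [name for name in required if name not in found]


# Hypothetical page head, used only for illustration.
EXAMPLE_HEAD = """<head>
<meta property="og:title" content="Jane Smith" />
<meta property="og:image" content="https://your-domain.com/assets/img/og-image.png" />
</head>"""

if __name__ == "__main__":
    print(missing_og_tags(EXAMPLE_HEAD))
```

In practice you would pass in the contents of a file such as _site/index.html after a build.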
Per-page OG images:
Add to the frontmatter of a blog post or page:
---
layout: post
title: My Research Paper
og_image: /assets/img/paper-diagram.png
---

Schema.org is structured data that tells search engines what kind of content is on your page:
- "This is a Person" (your bio page)
- "This is a Publication" (your paper)
- "This is a BlogPosting" (your article)
Benefits:
- Rich snippets in search results
- Better knowledge graph information
- Schema validation helps Google understand your site
Enable in _config.yml:
serve_schema_org: true # Change from false to true

That's it! al-folio automatically marks up:
- Author info (Person schema with name, URL, photo)
- Blog posts (BlogPosting schema with date, title, description)
- Publications (CreativeWork/ScholarlyArticle schema)
Homepage (Person):
- Your name, photo, description
- Links to your profiles (LinkedIn, GitHub, etc.)
Blog posts (BlogPosting):
- Title, date, author, description
- Content
- Publication date and modified date
Publications (ScholarlyArticle):
- Title, authors, abstract
- Publication date, venue
- URL and PDF links
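To confirm that structured data actually reaches the rendered page, you can look for it in the built HTML. The sketch below assumes the markup is emitted as JSON-LD inside script tags of type application/ld+json, which is a common way to publish Schema.org data. Check your own page source to confirm this holds for your build. The class and function names and the sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the parsed contents of every JSON-LD script block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def schema_types(html: str) -> list:
    """Return the @type of each top-level JSON-LD object in the page."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return [block.get("@type") for block in parser.blocks if isinstance(block, dict)]


# Hypothetical homepage fragment, used only for illustration.
EXAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person", "name": "Jane Smith"}
</script>
</head><body></body></html>"""

if __name__ == "__main__":
    print(schema_types(EXAMPLE_PAGE))
```

An empty result on a page where you expect a Person or BlogPosting suggests that serve_schema_org is still disabled or the site has not been rebuilt.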
Google Search Console lets you monitor how your site appears in Google search results.
Setup:
- Go to Google Search Console
- Add your website:
  - Click "URL prefix"
  - Enter your site URL: https://your-domain.com
- Verify ownership (choose one method):
  - HTML file upload – Download file, add to repository root
  - HTML tag – Copy the verification code from the meta tag into _config.yml, then redeploy
  - Google Analytics – If you already use Google Analytics
  - DNS record – Advanced (if you own the domain)
Add to _config.yml:
google_site_verification: YOUR_VERIFICATION_CODE

(Replace YOUR_VERIFICATION_CODE with the code from Search Console.)
Monitor in Search Console:
- Performance – Which queries bring traffic, your ranking position
- Coverage – Any indexing errors
- Enhancements – Schema.org validation
- Sitemaps – Your sitemap status
Similar to Google Search Console but for Bing search:
- Go to Bing Webmaster Tools
- Add your site
- Verify ownership (Bing can usually import a site you have already verified in Google Search Console)
- Add to _config.yml:
  bing_site_verification: YOUR_BING_CODE

Note: Bing Webmaster Tools is optional but recommended. Check both dashboards regularly.
Goal: Get your publications listed on Google Scholar so they show up in scholar search results.
Google Scholar auto-crawls:
- Your website automatically (if publicly accessible)
- Your publications page if it has proper markup
- PDFs linked from your site
To improve Scholar indexing:
- Ensure BibTeX has proper format:
  @article{mykey,
    title={Your Paper Title},
    author={Your Name and Co-Author},
    journal={Journal Name},
    year={2024},
    volume={1},
    pages={1-10},
    doi={10.1234/doi}
  }
- Add PDFs to BibTeX:
  @article{mykey,
    # ... other fields ...
    pdf={my-paper.pdf} # File at assets/pdf/my-paper.pdf
  }
- Submit to Google Scholar (optional):
  - Go to Google Scholar Author Profile
  - Create a profile
  - Google will find your papers automatically within weeks
- Wait 3-6 months – Google Scholar takes time to index
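Incomplete BibTeX entries are easy to miss in a long bibliography. The sketch below is a rough regex-based check, not a full BibTeX parser, that reports which entries lack a few core fields. The required set (title, author, year) is an assumption chosen for illustration, and the sample entries are hypothetical.

```python
import re

# Matches the start of an entry, e.g. "@article{mykey,".
ENTRY_HEADER = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# Matches a field assignment, e.g. "title={" or "year = 2024".
FIELD = re.compile(r"(\w+)\s*=\s*[{\"\d]")


def missing_fields(entry: str, required=("title", "author", "year")) -> list[str]:
    """Return the required field names that do not appear in one entry."""
    present = {name.lower() for name in FIELD.findall(entry)}
    return [name for name in required if name not in present]


def check_bibtex(text: str) -> dict[str, list[str]]:
    """Map each citation key to the list of required fields it is missing."""
    report = {}
    # Split the file in front of every line that begins with "@".
    for chunk in re.split(r"(?m)^(?=@)", text):
        header = ENTRY_HEADER.match(chunk)
        if header:
            report[header.group(2)] = missing_fields(chunk)
    return report


# Hypothetical bibliography, used only for illustration.
EXAMPLE_BIB = """@article{mykey,
  title={Your Paper Title},
  author={Your Name and Co-Author},
  year={2024},
  pdf={my-paper.pdf}
}
@inproceedings{draft,
  title={Work in Progress}
}
"""

if __name__ == "__main__":
    for key, missing in check_bibtex(EXAMPLE_BIB).items():
        print(key, "missing:", missing or "nothing")
```

Because this relies on pattern matching, unusual formatting (for example, an equals sign inside a title) can confuse it. Treat the output as a hint rather than a verdict.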
If your research is computer science related:
- Go to DBLP
- Search for yourself or your papers
- If missing, submit via DBLP (requires account)
- DBLP will verify and add your work
If you have preprints:
- Go to arXiv.org
- Submit your preprint
- Once listed, your preprint is typically picked up by Google Scholar and general search engines automatically
Add arXiv link to BibTeX:
@article{mykey,
  # ... other fields ...
  arxiv={2401.12345} # arXiv ID (format YYMM.NNNNN)
}

Every page needs a title and description. These show in search results.
In _config.yml:
title: Jane Smith - Computer Science Researcher
description: >
  Academic website of Jane Smith, focusing on machine learning and AI ethics.

In page/post frontmatter:
---
layout: post
title: Novel Deep Learning Architecture for Climate Modeling
description: A new approach to improving climate model accuracy with deep learning
---

Checklist:
- Title under 60 characters (so it doesn't get cut off)
- Description 120-160 characters
- Include your name in the site title
- Include keywords naturally
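The length limits in the checklist are easy to verify mechanically. Here is a minimal sketch that applies the two numeric rules (title under 60 characters, description between 120 and 160 characters) to a title and description pair. The function name and message wording are hypothetical.

```python
def check_metadata(title: str, description: str) -> list[str]:
    """Return a list of problems with a page's title and description lengths."""
    problems = []
    if len(title) >= 60:
        problems.append(f"title is {len(title)} characters; keep it under 60")
    if not 120 <= len(description) <= 160:
        problems.append(
            f"description is {len(description)} characters; aim for 120-160"
        )
    return problems


if __name__ == "__main__":
    # Values taken from the frontmatter example in this guide.
    for problem in check_metadata(
        "Novel Deep Learning Architecture for Climate Modeling",
        "A new approach to improving climate model accuracy with deep learning",
    ):
        print(problem)
```

Run against the frontmatter example from this guide, the title passes but the description is flagged as shorter than the recommended range.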
Use proper HTML heading hierarchy for both SEO and accessibility:
# H1: Main Page Title
Use one H1 per page, usually your blog post or page title
## H2: Section Heading
### H3: Subsection
### H3: Another subsection
## H2: Another Section

Benefits:
- Search engines understand your content structure
- Screen readers can navigate better
- Visitors can scan your content
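Heading structure can be checked automatically for a Markdown file. The sketch below flags skipped levels (for example, an H4 directly after an H2) and more than one H1. It assumes ATX-style headings, meaning lines that start with # characters, and it ignores fenced code blocks. The function name is hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+\S")
FENCE = "`" * 3  # Marker that opens and closes a fenced code block.


def heading_problems(markdown: str) -> list[str]:
    """Report skipped heading levels and repeated H1s in a Markdown document."""
    problems = []
    previous = 0
    h1_count = 0
    in_fence = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # Lines such as "# comment" inside code are not headings.
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if level == 1:
            h1_count += 1
        if previous and level > previous + 1:
            problems.append(f"line {number}: H{level} follows H{previous}")
        previous = level
    if h1_count > 1:
        problems.append(f"found {h1_count} H1 headings; use only one")
    return problems


if __name__ == "__main__":
    sample = "# Main Page Title\n## Section\n#### Too deep\n## Another Section\n"
    print(heading_problems(sample))
```

Note that when a layout renders the frontmatter title as the page's H1, the Markdown body may legitimately contain no H1 at all, which is why only duplicates are flagged.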
For SEO:
- Use descriptive filenames: neural-network-architecture.png (not img1.png)
- Add alt text to every image (also helps accessibility)
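Missing alt text can be found by scanning the built HTML. The sketch below lists the src of every img tag whose alt attribute is absent or empty. The class and function names and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class ImgAltParser(HTMLParser):
    """Record the src of each <img> that has no usable alt text."""

    def __init__(self) -> None:
        super().__init__()
        self.missing: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        alt = (attr.get("alt") or "").strip()
        if not alt:
            self.missing.append(attr.get("src") or "")


def images_missing_alt(html: str) -> list[str]:
    """Return the src values of images without alt text."""
    parser = ImgAltParser()
    parser.feed(html)
    parser.close()
    return parser.missing


# Hypothetical page fragment, used only for illustration.
EXAMPLE_HTML = (
    '<img src="/assets/img/neural-network-architecture.png" '
    'alt="Diagram of the network architecture">'
    '<img src="/assets/img/img1.png">'
)

if __name__ == "__main__":
    print(images_missing_alt(EXAMPLE_HTML))
```

An empty alt attribute is legitimate for purely decorative images, so review the flagged images rather than filling in every one automatically.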

For performance:
- Optimize image file size (use tools like TinyPNG)
- Use modern formats (WebP instead of large JPGs)
- Responsive images (different sizes for mobile vs desktop)
Link between your own pages strategically:
See my [publication on climate AI](./publications/) or my [blog post on neural networks](/blog/2024/neural-networks/).

Benefits:
- Search engines crawl through your links
- Users discover more of your content
- Distributes "authority" across your site
al-folio auto-generates an RSS feed at /feed.xml.
Why RSS matters:
- Content aggregators pick up your posts
- Researchers can subscribe to your updates
- Improves discoverability
Ensure your feed works:
# In _config.yml
title: Your Site
description: Your site description
url: https://your-domain.com # MUST be complete URL

Test your feed:
- Visit https://your-site.com/feed.xml
- It should show XML with your recent posts
- Try subscribing in a feed reader (Feedly, etc.)
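A quick script can confirm that the feed parses and that its links are absolute, which is what a complete url setting should produce. This sketch assumes the feed is in Atom format. If yours is RSS 2.0, the element names differ. The function names and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def feed_entries(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        link = entry.find("atom:link", ATOM_NS)
        href = link.get("href", "") if link is not None else ""
        entries.append((title.strip(), href))
    return entries


def relative_links(entries: list[tuple[str, str]]) -> list[str]:
    """Return entry links that are not absolute http(s) URLs."""
    return [href for _, href in entries if not href.startswith(("http://", "https://"))]


# Hypothetical feed contents, used only for illustration.
EXAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Your Site</title>
  <entry>
    <title>Neural Networks</title>
    <link href="https://your-domain.com/blog/2024/neural-networks/" />
  </entry>
</feed>"""

if __name__ == "__main__":
    entries = feed_entries(EXAMPLE_FEED)
    print(entries)
    print("relative links:", relative_links(entries))
```

Any relative link reported here usually points back to a missing or incomplete url value in _config.yml.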
Search engines favor fast, mobile-friendly sites.
Check your site:
- Use Google PageSpeed Insights
- Enter your site URL
- Review recommendations
- al-folio already optimizes for performance, but you can improve further:
  - Compress images
  - Minimize CSS/JS (enabled by default)
  - Use lazy loading (already enabled)
Mobile optimization:
- al-folio is responsive by default
- Test on phones/tablets
- Ensure buttons are large enough to tap
- Check readability on small screens
Before considering your site "SEO optimized":
Basic Setup:
- _config.yml has title, description, author, url
- Sitemap accessible at /sitemap.xml
- robots.txt accessible at /robots.txt
- Mobile-friendly (test on phone)
Search Console:
- Google Search Console linked
- Bing Webmaster Tools linked (optional but recommended)
- No major indexing errors
- Sitemaps submitted
Schema/Open Graph:
- serve_og_meta: true (for social sharing)
- serve_schema_org: true (for structured data)
- Test OG with Facebook Debugger
- Validate schema at Schema.org Validator
Content:
- Every page has unique title (under 60 chars)
- Every page has description (120-160 chars)
- Blog posts have proper dates
- Images have descriptive alt text
- Headings follow proper hierarchy
Publications:
- BibTeX entries have proper format
- PDFs linked from BibTeX
- Submitted to Google Scholar (optional)
- Indexed on DBLP or arXiv (if applicable)
Performance:
- Site loads under 3 seconds (check PageSpeed)
- No broken links (use Lighthouse or similar)
- RSS feed works (check /feed.xml)
- Google Search Central – Official SEO guide
- Moz SEO Checklist – Beginner-friendly guide
- Google PageSpeed Insights – Performance analysis
- Schema.org – Structured data reference
- WebAIM – Accessibility (helps SEO too)
- Lighthouse Audit – Browser extension
Next Steps:
- Enable Open Graph and Schema.org in _config.yml
- Set up Google Search Console and Bing Webmaster Tools
- Optimize your page titles and descriptions
- Add alt text to images and PDFs to your BibTeX
- Monitor Search Console regularly for indexing issues
Your research will be more discoverable with these optimizations! 🔍