My Fandom Wikis to Obsidian Vaults Converter (Python Code Included)
The other day, I had the genius idea of creating fanfictions using AI. I had a few anime series in mind, but it didn't really matter which one; it's a "just for fun" idea... For that, I needed a lot of data on the characters and setting, plus an AI model good at creative writing with a long context length.
For the data part, I decided to collect it from Wikia/Fandom.com pages, and since feeding whole pages to the model would waste a lot of AI tokens, I wanted a fast way to crawl the website and keep only the data I wanted.
Method:
Ideally, I want to be able to create a text file containing only the relevant data for the fanfiction I'm creating... So, here's the method I arrived at:
- Crawl the whole Wiki for the series I'm looking at. (Example: Dragon Ball)
- Convert the crawled pages into Markdown files suitable for the [Obsidian] notes app
- Use Obsidian's embeddings plugins and a ChatGPT plugin to create an even more focused .txt file containing only the info I need to pass to the AI
For the crawling part, I used Crawl4AI's Docker app, which needs Python code to call it. I used AI to write a script suitable for crawling Fandom.com websites and removing redundant parts of each page. The result is a batch of markdown files that I put in my Obsidian vault for further refining...
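Everything below assumes the Crawl4AI container is already running. As a quick sanity check, here's a minimal snippet you can run first; the port, health endpoint, and `docker run` command are the same ones the crawler script itself uses:

```python
# Quick check that the Crawl4AI Docker service is reachable before crawling.
# If it fails, start the container with:
#   docker run -p 11235:11235 ghcr.io/unclecode/crawl4ai:latest
import requests

try:
    r = requests.get("http://localhost:11235/health", timeout=10)
    print("Crawl4AI is up" if r.status_code == 200 else f"Unexpected status: {r.status_code}")
except requests.exceptions.RequestException as e:
    print(f"Could not reach Crawl4AI: {e}")
```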
Prerequisites
- Crawl4AI Docker Version: https://github.com/unclecode/crawl4ai
- Obsidian: https://obsidian.md/
- Obsidian Smart Connections Plugin: https://github.com/brianpetro/obsidian-smart-connections
The Code
So, here's my current code. It works but is kind of clunky, and since I vibe-coded it using AI, I'm not sure how all of it works (I only understand the basics), so further editing would be a nightmare...
Still, I decided to share it, just in case anyone is interested:
Disclaimer: The code was created using the Qwen3-235B model over an hour of conversation. The results are the two files below:
bulk_fandom_crawler.py:
The script below is called via the terminal: `python3 bulk_fandom_crawler.py WEBSITEURL --max-crawl PAGESCOUNT`
```python
import csv
import os
import re
import requests
from urllib.parse import urlparse, urljoin
from bs4 import BeautifulSoup
from crawl_to_mkdn import Crawl4AiCrawler
import time
import logging
from typing import Set, List, Tuple


class BulkFandomCrawler:
    def __init__(self, base_url: str = "http://localhost:11235"):
        self.crawler = Crawl4AiCrawler(base_url)
        self.links_csv = "fandom_links.csv"
        self.finished_csv = "finished_crawled.csv"
        self.output_root = "crawled_sites"
        self.log_file = "crawler_logs.txt"

        # Initialize logging
        self._setup_logging()

        # Initialize CSV files
        self._init_csv_files()

    def _setup_logging(self):
        """Setup logging to file and console."""
        logging.basicConfig(
            level=logging.INFO,
            format='%(asctime)s - %(levelname)s - %(message)s',
            handlers=[
                logging.FileHandler(self.log_file, encoding='utf-8'),
                logging.StreamHandler()
            ]
        )
        self.logger = logging.getLogger(__name__)

    def _init_csv_files(self):
        """Initialize CSV files with headers if they don't exist."""
        # Initialize links CSV
        if not os.path.exists(self.links_csv):
            with open(self.links_csv, 'w', newline='', encoding='utf-8') as f:
                writer = csv.writer(f)
                writer.writerow(['URL', 'Topic_Title', 'Subdomain', 'Status'])
            self.logger.info(f"📄 Created new links CSV: {self.links_csv}")

        # Initialize finished CSV
        if not os.path.exists(self.finished_csv):
            with open(self.finished_csv, 'w', newline='', encoding='utf-8') as f:
                writer = csv.writer(f)
                writer.writerow(['URL', 'Topic_Title', 'Subdomain', 'Crawl_Date', 'Success', 'Word_Count', 'File_Size'])
            self.logger.info(f"📄 Created new finished CSV: {self.finished_csv}")

    def _get_finished_urls(self) -> Set[str]:
        """Get set of already crawled URLs from finished CSV."""
        finished_urls = set()
        if os.path.exists(self.finished_csv):
            with open(self.finished_csv, 'r', encoding='utf-8') as f:
                reader = csv.DictReader(f)
                for row in reader:
                    finished_urls.add(row['URL'])
        return finished_urls

    def _extract_title_from_url(self, url: str) -> str:
        """Extract topic title from fandom URL."""
        parsed = urlparse(url)
        path_parts = parsed.path.strip('/').split('/')
        if 'wiki' in path_parts:
            wiki_index = path_parts.index('wiki')
            if wiki_index + 1 < len(path_parts):
                title = path_parts[wiki_index + 1]
                # Replace underscores with spaces and decode URL encoding
                title = title.replace('_', ' ')
                return title
        return "Unknown"

    def _get_subdomain(self, url: str) -> str:
        """Extract subdomain from fandom URL."""
        parsed = urlparse(url)
        domain_parts = parsed.netloc.split('.')
        if len(domain_parts) >= 3 and 'fandom.com' in parsed.netloc:
            return domain_parts[0]
        return "unknown"

    def discover_links(self, start_url: str) -> List[Tuple[str, str, str]]:
        """
        Discover fandom links from a starting URL.
        Returns list of (url, title, subdomain) tuples.
        """
        self.logger.info(f"🔍 Discovering links from: {start_url}")
        try:
            # Use requests to get the page content for link discovery
            response = requests.get(start_url, timeout=30)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, 'html.parser')

            # Extract subdomain from start URL
            start_subdomain = self._get_subdomain(start_url)

            # Find all links
            links = []
            for link in soup.find_all('a', href=True):
                href = link['href']
                # Convert relative URLs to absolute
                if href.startswith('/'):
                    href = urljoin(start_url, href)
                # Check if it's a fandom wiki link from same subdomain
                if self._is_valid_fandom_link(href, start_subdomain):
                    title = self._extract_title_from_url(href)
                    subdomain = self._get_subdomain(href)
                    links.append((href, title, subdomain))

            # Remove duplicates
            unique_links = list(set(links))
            self.logger.info(f"✅ Found {len(unique_links)} unique fandom links")
            return unique_links
        except Exception as e:
            self.logger.error(f"❌ Error discovering links from {start_url}: {str(e)}")
            return []

    def _is_valid_fandom_link(self, url: str, target_subdomain: str) -> bool:
        """Check if URL is a valid fandom wiki link from the target subdomain."""
        try:
            parsed = urlparse(url)
            # Must be fandom.com domain
            if 'fandom.com' not in parsed.netloc:
                return False
            # Must be from same subdomain
            if self._get_subdomain(url) != target_subdomain:
                return False
            # Must be a wiki page
            if '/wiki/' not in parsed.path:
                return False
            # Exclude certain pages
            excluded_patterns = [
                'Special:', 'File:', 'Category:', 'Template:', 'User:', 'Talk:',
                'action=', 'oldid=', '#', '?'
            ]
            for pattern in excluded_patterns:
                if pattern in url:
                    return False
            return True
        except Exception:
            return False

    def add_links_to_csv(self, links: List[Tuple[str, str, str]]):
        """Add discovered links to the main links CSV."""
        existing_urls = set()
        # Read existing URLs
        if os.path.exists(self.links_csv):
            with open(self.links_csv, 'r', encoding='utf-8') as f:
                reader = csv.DictReader(f)
                for row in reader:
                    existing_urls.add(row['URL'])

        # Add new links
        new_links = 0
        with open(self.links_csv, 'a', newline='', encoding='utf-8') as f:
            writer = csv.writer(f)
            for url, title, subdomain in links:
                if url not in existing_urls:
                    writer.writerow([url, title, subdomain, 'pending'])
                    new_links += 1
        self.logger.info(f"➕ Added {new_links} new links to {self.links_csv}")

    def get_pending_urls(self) -> List[Tuple[str, str, str]]:
        """Get URLs that haven't been crawled yet."""
        finished_urls = self._get_finished_urls()
        pending_urls = []
        if os.path.exists(self.links_csv):
            with open(self.links_csv, 'r', encoding='utf-8') as f:
                reader = csv.DictReader(f)
                for row in reader:
                    if row['URL'] not in finished_urls and row['Status'] == 'pending':
                        pending_urls.append((row['URL'], row['Topic_Title'], row['Subdomain']))
        return pending_urls

    def crawl_url(self, url: str, title: str, subdomain: str) -> bool:
        """Crawl a single URL and record the result."""
        try:
            self.logger.info(f"🕷️ Crawling: {url}")
            self.crawler.crawl_and_save(url, self.output_root)

            # Calculate stats for the crawled content
            word_count, file_size = self._get_content_stats(url)

            # Record success in finished CSV
            with open(self.finished_csv, 'a', newline='', encoding='utf-8') as f:
                writer = csv.writer(f)
                writer.writerow([url, title, subdomain, time.strftime('%Y-%m-%d %H:%M:%S'), 'True', word_count, file_size])

            self.logger.info(f"✅ Successfully crawled: {title} ({word_count} words, {file_size} bytes)")
            return True
        except Exception as e:
            self.logger.error(f"❌ Failed to crawl {url}: {str(e)}")
            # Record failure in finished CSV
            with open(self.finished_csv, 'a', newline='', encoding='utf-8') as f:
                writer = csv.writer(f)
                writer.writerow([url, title, subdomain, time.strftime('%Y-%m-%d %H:%M:%S'), 'False', 0, 0])
            return False

    def _get_content_stats(self, url: str) -> Tuple[int, int]:
        """Get word count and file size for crawled content."""
        try:
            # Parse URL to find the output file
            parsed = urlparse(url)
            site_dir = parsed.netloc
            path = parsed.path.strip("/")
            full_path_dir = os.path.join(self.output_root, site_dir, path)
            filename = "index.txt" if not path else f"{path.split('/')[-1]}.txt"
            output_file = os.path.join(full_path_dir, filename)

            if os.path.exists(output_file):
                file_size = os.path.getsize(output_file)
                with open(output_file, 'r', encoding='utf-8') as f:
                    content = f.read()
                word_count = len(content.split())
                return word_count, file_size
        except Exception as e:
            self.logger.warning(f"⚠️ Could not calculate stats for {url}: {str(e)}")
        return 0, 0

    def bulk_crawl(self, max_urls: int = None, delay: float = 1.0):
        """Perform bulk crawling of pending URLs."""
        pending_urls = self.get_pending_urls()
        if not pending_urls:
            self.logger.info("📭 No pending URLs to crawl")
            return

        if max_urls:
            pending_urls = pending_urls[:max_urls]

        self.logger.info(f"🚀 Starting bulk crawl of {len(pending_urls)} URLs")

        success_count = 0
        total_words = 0
        total_size = 0
        for i, (url, title, subdomain) in enumerate(pending_urls, 1):
            self.logger.info(f"\n📄 [{i}/{len(pending_urls)}] Processing: {title}")
            if self.crawl_url(url, title, subdomain):
                success_count += 1
                # Get stats for this crawl
                word_count, file_size = self._get_content_stats(url)
                total_words += word_count
                total_size += file_size

            # Add delay between requests
            if delay > 0 and i < len(pending_urls):
                time.sleep(delay)

        self.logger.info(f"\n🎉 Bulk crawl completed!")
        self.logger.info(f"📊 Results: {success_count}/{len(pending_urls)} successful")
        self.logger.info(f"📝 Total words: {total_words:,}")
        self.logger.info(f"💾 Total size: {total_size:,} bytes")

    def start_crawling(self, start_url: str, discover_new: bool = True, max_crawl: int = None):
        """Main method to start the crawling process."""
        self.logger.info("🚀 Starting Bulk Fandom Crawler")
        self.logger.info(f"🌐 Start URL: {start_url}")

        # Show current status
        self._show_status()

        # Validate start URL
        if not self._is_valid_fandom_link(start_url, self._get_subdomain(start_url)):
            self.logger.error("❌ Invalid fandom URL provided")
            return

        # Check Crawl4AI health
        try:
            health = requests.get("http://localhost:11235/health", timeout=10)
            if health.status_code != 200:
                self.logger.error("❌ Crawl4AI service not healthy")
                return
            self.logger.info("✅ Crawl4AI health check passed")
        except requests.exceptions.RequestException:
            self.logger.error("❌ Could not connect to Crawl4AI. Please start the Docker container:")
            self.logger.error("   docker run -p 11235:11235 ghcr.io/unclecode/crawl4ai:latest")
            return

        # Discover new links if requested
        if discover_new:
            links = self.discover_links(start_url)
            if links:
                self.add_links_to_csv(links)

        # Show updated status
        self._show_status()

        # Start bulk crawling
        self.bulk_crawl(max_urls=max_crawl)

    def _show_status(self):
        """Display current crawler status."""
        pending_count = len(self.get_pending_urls())
        finished_count = len(self._get_finished_urls())
        self.logger.info("📊 Current Status:")
        self.logger.info(f"  📋 Pending URLs: {pending_count}")
        self.logger.info(f"  ✅ Finished URLs: {finished_count}")
        self.logger.info(f"  📁 Output directory: {self.output_root}")
        self.logger.info(f"  📄 Links CSV: {self.links_csv}")
        self.logger.info(f"  📄 Finished CSV: {self.finished_csv}")
        self.logger.info(f"  📝 Log file: {self.log_file}")


def main():
    import sys

    if len(sys.argv) < 2:
        print("Usage: python bulk_fandom_crawler.py <START_URL> [--no-discover] [--max-crawl N]")
        print("Example: python bulk_fandom_crawler.py https://mushokutensei.fandom.com/wiki/Roxy_Migurdia")
        print("Options:")
        print("  --no-discover: Skip link discovery, only crawl existing pending URLs")
        print("  --max-crawl N: Limit crawling to N URLs")
        sys.exit(1)

    start_url = sys.argv[1]
    discover_new = '--no-discover' not in sys.argv
    max_crawl = None

    # Parse max-crawl option
    if '--max-crawl' in sys.argv:
        try:
            max_idx = sys.argv.index('--max-crawl')
            if max_idx + 1 < len(sys.argv):
                max_crawl = int(sys.argv[max_idx + 1])
        except (ValueError, IndexError):
            print("❌ Invalid --max-crawl value")
            sys.exit(1)

    # Start crawling
    crawler = BulkFandomCrawler()
    crawler.start_crawling(start_url, discover_new=discover_new, max_crawl=max_crawl)


if __name__ == "__main__":
    main()
```
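Note that the script imports `Crawl4AiCrawler` from `crawl_to_mkdn.py`, which isn't included in this post. For context, here's a minimal sketch of the interface `bulk_fandom_crawler.py` expects. Only the class name, the `crawl_and_save(url, output_root)` signature, the default base URL, and the output layout are taken from the code above; the `/crawl` request and response shape are assumptions, so check them against the Crawl4AI docs for your version rather than treating this as the real helper:

```python
# Hypothetical sketch of crawl_to_mkdn.py -- NOT the original helper from this post.
# The /crawl endpoint payload and response fields below are assumptions.
import os
from urllib.parse import urlparse

import requests


class Crawl4AiCrawler:
    def __init__(self, base_url: str = "http://localhost:11235"):
        self.base_url = base_url.rstrip('/')

    def crawl_and_save(self, url: str, output_root: str) -> str:
        """Ask the Crawl4AI server to crawl `url` and save the result as a .txt file."""
        # Assumed endpoint and payload -- adjust to your Crawl4AI version.
        response = requests.post(
            f"{self.base_url}/crawl",
            json={"urls": [url]},
            timeout=120,
        )
        response.raise_for_status()
        data = response.json()
        # Assumed response shape: a list of results with a "markdown" field.
        markdown = data["results"][0].get("markdown", "") or "(No content extracted)"

        # Mirror the layout that _get_content_stats() in the crawler expects:
        # crawled_sites/<netloc>/<path>/<last_segment>.txt
        parsed = urlparse(url)
        path = parsed.path.strip("/")
        out_dir = os.path.join(output_root, parsed.netloc, path)
        filename = "index.txt" if not path else f"{path.split('/')[-1]}.txt"
        os.makedirs(out_dir, exist_ok=True)
        out_file = os.path.join(out_dir, filename)
        with open(out_file, "w", encoding="utf-8") as f:
            f.write(markdown)
        return out_file
```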
crawl2obsidian.py:
This one is run as `python3 crawl2obsidian.py` after the crawl finishes; it reads the crawled_sites/ folder and writes the cleaned notes to obsidian_vault/.
```python
#!/usr/bin/env python3
"""
Convert crawled Fandom pages to clean Obsidian markdown vault.
Keeps images but strips all URL links while preserving text content.
"""
import os
import re
from pathlib import Path


def clean_fandom_content(content):
    """Clean fandom content by removing links but keeping images and text."""
    # Keep images but remove their URLs - convert ![alt](url) to ![alt]
    content = re.sub(r'!\[([^\]]*)\]\([^)]+\)', r'![\1]', content)

    # Remove fandom links but keep the display text
    # Pattern: [text](https://mushokutensei.fandom.com/wiki/Page "Page")
    content = re.sub(r'\[([^\]]+)\]\(https://[^.]+\.fandom\.com/[^)]+\)', r'\1', content)

    # Remove internal navigation links like [1 Receptionist](url#section) but keep text
    content = re.sub(r'\[[\d\.]+ ([^\]]+)\]\([^)]+#[^)]+\)', r'\1', content)

    # Remove edit links [[](auth.fandom.com...)]
    content = re.sub(r'\[\[\]\([^)]+\)\]', '', content)

    # Remove citation links like [[1]](url) but keep the number
    content = re.sub(r'\[\[(\d+)\]\]\([^)]+\)', r'[\1]', content)

    # Remove "Jump up" reference links
    content = re.sub(r'\[↑\]\([^)]+\s+"[^"]*"\)', '', content)
    content = re.sub(r'↑ \[Jump up to: [^\]]+\]', '', content)

    # Remove remaining markdown links but keep text
    content = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', content)

    # Remove HTML-like tags
    content = re.sub(r'<[^>]+>', '', content)

    # Remove table separators
    content = re.sub(r'^---\s*$', '', content, flags=re.MULTILINE)

    # Remove "Sign in to edit" text
    content = re.sub(r'"Sign in to edit"', '', content)

    # Remove lines that are just numbers (like "1/2", "1/3")
    content = re.sub(r'^\d+/\d+\s*$', '', content, flags=re.MULTILINE)

    # Remove "Voiced by:" lines with complex formatting
    content = re.sub(r'Voiced by:.*?(?=\n##|\n\n|\Z)', '', content, flags=re.DOTALL)

    # Clean up navigation sections at the end
    content = re.sub(r'## Navigation.*?$', '', content, flags=re.DOTALL)

    # Remove expand sections
    content = re.sub(r'Expand\[.*?\].*?(?=\n##|\n\n|\Z)', '', content, flags=re.DOTALL)

    # Clean up multiple empty lines
    content = re.sub(r'\n\s*\n\s*\n+', '\n\n', content)

    # Remove empty lines at start and end
    content = content.strip()

    return content


def is_empty_content(content):
    """Check if content is empty or only contains '(No content extracted)'."""
    cleaned = content.strip()
    return (
        not cleaned or
        cleaned == "(No content extracted)" or
        len(cleaned) < 10  # Very short content is likely meaningless
    )


def remove_empty_directories(directory):
    """Recursively remove empty directories."""
    removed_count = 0
    # Walk bottom-up to remove empty directories
    for root, dirs, files in os.walk(directory, topdown=False):
        for dir_name in dirs:
            dir_path = os.path.join(root, dir_name)
            try:
                # Try to remove if empty
                os.rmdir(dir_path)
                removed_count += 1
                print(f"Removed empty directory: {dir_path}")
            except OSError:
                # Directory not empty, skip
                pass
    return removed_count


def find_page_groups(input_dir):
    """Find groups of files that should be consolidated into single pages."""
    page_groups = {}
    all_files = list(input_dir.rglob('*.txt'))

    # First, identify potential main pages and their subpages
    for txt_file in all_files:
        relative_path = txt_file.relative_to(input_dir)
        path_parts = relative_path.parts

        # Look for main page files (at directory level, not in subdirectories)
        if len(path_parts) >= 2:
            parent_dir = relative_path.parent
            filename = relative_path.stem

            # Check if there are subdirectories with the same base name as this file
            potential_subpage_dirs = []
            for other_file in all_files:
                other_relative = other_file.relative_to(input_dir)
                other_parts = other_relative.parts

                # Check if this other file is in a subdirectory of the same parent
                # and the subdirectory name could be a subpage of our main file
                if (len(other_parts) >= 3 and
                        other_relative.parent.parent == parent_dir and
                        other_relative.stem == other_parts[-2]):  # filename matches its directory
                    # Check if this could be a subpage of our main file
                    subpage_dir = other_parts[-2]
                    if subpage_dir.lower() in ['appearance', 'relationships', 'story', 'chronology',
                                               'gallery', 'powers_and_abilities', 'future']:
                        potential_subpage_dirs.append(other_file)

            # If we found potential subpages, create a group
            if potential_subpage_dirs:
                group_key = f"{parent_dir}/{filename}"
                if group_key not in page_groups:
                    page_groups[group_key] = {'main': txt_file, 'subpages': potential_subpage_dirs}

    # Find standalone files (not part of any group)
    grouped_files = set()
    for group in page_groups.values():
        grouped_files.add(group['main'])
        grouped_files.update(group['subpages'])

    standalone_files = [f for f in all_files if f not in grouped_files]
    return page_groups, standalone_files


def process_page_group(group, input_dir, output_dir):
    """Process a group of related files into a single markdown file."""
    main_file = group['main']
    subpages = sorted(group['subpages'], key=lambda x: x.name)

    combined_content = []

    # Process main file first
    try:
        with open(main_file, 'r', encoding='utf-8') as f:
            main_content = f.read()
        if not is_empty_content(main_content):
            cleaned_main = clean_fandom_content(main_content)
            if not is_empty_content(cleaned_main):
                combined_content.append(cleaned_main)
    except Exception as e:
        print(f"Error reading main file {main_file}: {e}")

    # Process subpages
    for subpage in subpages:
        try:
            with open(subpage, 'r', encoding='utf-8') as f:
                content = f.read()
            if not is_empty_content(content):
                cleaned_content = clean_fandom_content(content)
                if not is_empty_content(cleaned_content):
                    # Use the subpage filename as section header
                    section_name = subpage.stem.replace('_', ' ')
                    combined_content.append(f"\n\n# {section_name}\n\n{cleaned_content}")
        except Exception as e:
            print(f"Error reading subpage {subpage}: {e}")

    if combined_content:
        # Create output path
        relative_path = main_file.relative_to(input_dir)
        filename_with_spaces = relative_path.stem.replace('_', ' ')
        output_path = relative_path.parent / f"{filename_with_spaces}.md"
        output_file = output_dir / output_path

        # Create output directory if needed
        output_file.parent.mkdir(parents=True, exist_ok=True)

        # Write combined content
        final_content = '\n\n'.join(combined_content)
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write(final_content)

        return True, len(group['subpages']) + 1  # +1 for main file

    return False, 0


def process_crawled_files():
    """Process all crawled .txt files and convert to clean markdown."""
    input_dir = Path('crawled_sites')
    output_dir = Path('obsidian_vault')

    # Create output directory
    output_dir.mkdir(exist_ok=True)

    processed_count = 0
    skipped_count = 0
    consolidated_count = 0

    # Find page groups for consolidation
    print("Analyzing file structure for consolidation...")
    page_groups, standalone_files = find_page_groups(input_dir)
    print(f"Found {len(page_groups)} page groups to consolidate")
    print(f"Found {len(standalone_files)} standalone files")

    # Process consolidated page groups
    for group_key, group in page_groups.items():
        try:
            success, file_count = process_page_group(group, input_dir, output_dir)
            if success:
                processed_count += 1
                consolidated_count += file_count
                print(f"Consolidated: {group['main'].relative_to(input_dir)} (+{len(group['subpages'])} subpages)")
            else:
                skipped_count += file_count
        except Exception as e:
            print(f"Error processing group {group_key}: {e}")
            skipped_count += len(group['subpages']) + 1

    # Process standalone files
    for txt_file in standalone_files:
        try:
            # Read original content
            with open(txt_file, 'r', encoding='utf-8') as f:
                content = f.read()

            # Skip if content is empty or meaningless
            if is_empty_content(content):
                print(f"Skipped empty file: {txt_file.relative_to(input_dir)}")
                skipped_count += 1
                continue

            # Clean the content
            cleaned_content = clean_fandom_content(content)

            # Double-check if cleaned content is now empty
            if is_empty_content(cleaned_content):
                print(f"Skipped file with no useful content after cleaning: {txt_file.relative_to(input_dir)}")
                skipped_count += 1
                continue

            # Create output path maintaining directory structure
            relative_path = txt_file.relative_to(input_dir)
            # Convert underscores to spaces in filename for easier linking
            filename_with_spaces = relative_path.stem.replace('_', ' ')
            output_path = relative_path.parent / f"{filename_with_spaces}.md"
            output_file = output_dir / output_path

            # Create output directory if needed
            output_file.parent.mkdir(parents=True, exist_ok=True)

            # Write cleaned content
            with open(output_file, 'w', encoding='utf-8') as f:
                f.write(cleaned_content)

            processed_count += 1
            print(f"Processed: {relative_path}")
        except Exception as e:
            print(f"Error processing {txt_file}: {e}")

    # Remove empty directories
    print("\nRemoving empty directories...")
    removed_dirs = remove_empty_directories(output_dir)

    print(f"\nCompleted!")
    print(f"Processed: {processed_count} files")
    print(f"Consolidated: {consolidated_count} files into page groups")
    print(f"Skipped: {skipped_count} empty files")
    print(f"Removed: {removed_dirs} empty directories")
    print(f"Output directory: {output_dir.absolute()}")


if __name__ == "__main__":
    process_crawled_files()
```
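To see what the cleaning pass actually does, here's a tiny example you can run from the same folder as crawl2obsidian.py (the sample text is made up; the output follows from the regexes above):

```python
from crawl2obsidian import clean_fandom_content

# Invented Fandom-style markdown: an image, a wiki link, and a citation link.
sample = (
    "![Goku infobox image](https://static.wikia.nocookie.net/goku.png) "
    "[Goku](https://dragonball.fandom.com/wiki/Goku \"Goku\") is the main "
    "protagonist.[[1]](https://dragonball.fandom.com/wiki/Goku#cite_note-1)"
)
print(clean_fandom_content(sample))
# Image keeps its alt text but loses its URL, the wiki link collapses to plain
# text, and the citation keeps only its number:
# ![Goku infobox image] Goku is the main protagonist.[1]
```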
Posted Using INLEO
Hey, I use Obsidian too, just not in that way. I use the calendar plugin to schedule my writing and make sure that I'm writing every day :)
It is a great piece of software!
I've wanted to do that for a while too... especially the writing part, but I usually do it in the browser instead. !LOLZ !PIZZA
The note-taking app that you use appears to be good, though it has a freemium model where many basic features (not necessarily those that will cost the provider over time, such as hosting sizable files) require a payment of $5 to $10 per month (with a 20% yearly discount). 💸🤯😅 !WEIRD !INDEED !HOPE !PIZZA !HUG
For my purposes, the Free version is enough for me.
I wonder which other note-taking apps you have considered (and tried to use), and what you could say about them. 🤓 !INDEED !LOLZ
Hmmm... Not specifically a note-taking app, but I used Trello for a while~
I have read a lot of online reviews of that app saying its company doesn't care much about its free users !INDEED. 🤯🤯🤓 !WEIRD !INDEED !HOPE !PIZZA
Yeah, I started noticing that more and more...
I wonder how accurate something else I read online is: that a paid account on that platform can go so far as to (quickly and easily) take over any free account there. 🤔🤯🤯 !WEIRD !INDEED !HOPE !HUG
Wait, take over?! What does that mean?
I don't know exactly how it's done from the way the reviewers explained it, but basically, a paid account can put a free account under it and gain access to most (if not all) of the free account's content. 🤔🤯 !WEIRD !INDEED !HOPE !PIZZA
Wow, that's evil if it's true!
It would be better if you searched for that specific type of incident yourself, even through indirect means such as simply looking up reviews of the company online. 🤔🧘♂️🤓 !HOPE !INDEED !WEIRD !PIZZA
Love how your find_page_groups bundles character subpages like appearance and relationships into one clean note; that's the kind of glue that saves context tokens. The link stripping that keeps display text and images makes the vault readable, and your CSV tracking with success flags keeps runs sane even if a crawl hiccups. This feels like a tidy pipeline from fandom chaos to Obsidian, turning the wiki spaghetti into dinner.
Yeah, I needed to do that and I'm glad the AI managed to code it for me.
Totally fair. Splitting the bulk crawl and the Obsidian cleanup was the right call, and it shows. That consolidation step is definitely saving tokens while keeping context tidy.