Tuesday, April 01, 2014

Run Your Research Demo Site on the Cloud

Last week, Travis Hance and I spent hours wading through the many blog posts of the internet to figure out how to set up a simple website on Amazon EC2 using our Jeeves language, which runs on Python and C++. Because we want to spare you this trouble, we put together this definitive* post for people who want to run the simplest possible research demo site on Amazon EC2. We cover the following:
  1. How to set up an Amazon EC2 instance and SSH to it (so that you can install and configure whatever you like).
  2. How to set up and configure an Apache web server on your Amazon EC2 instance.
  3. How to set up your database and what to do if you want to host your own database on your Amazon EC2 instance.
  4. How to configure virtual hosts on your Apache web server if you want to use the same server to host different projects on different subdomains.
This post assumes you have experience using Django and testing things on your local machine. We're using Django 1.6.2 with Apache 2.4.9. These instructions are tailored for an Ubuntu instance, but they probably generalize as well.

Is Amazon EC2 for me?

The first thing to do is to determine whether you need to run your own EC2 server. Amazon's Elastic Compute Cloud (EC2) gives you elastic compute in the cloud. The biggest win is you can easily change how much capacity you have with minimal friction. It's also just a nice way to host servers without managing your own physical machines.

If you just need vanilla Django hosting, then you should probably find some other hosting service that can manage things for you. In our case, we wanted to use the Z3 SMT solver, which runs on C++, so we needed to run our own server.

Fellow CSAILers may be interested to learn that I have also set up a mirror site on our department's OpenStack cloud. This is free for people in our department and is useful if you don't need permanent cloud data storage.

How to set up a cloud instance.

Once you decide you want to set things up on EC2, it's pretty easy to get started. At the time we signed up, there was a free Linux tier that gave you 750 hours at no cost. Amazon recently announced further price cuts, so the situation may be even more exciting by now. To set up your own Amazon EC2 server, sign up here and follow the instructions for launching a new instance.

SSHing to your EC2 instance.

In order to SSH to your instance, you will need to set the permissions of your server to allow this. You can do this by going to your EC2 management console and adding your IP address (or all IP addresses if you want to live on the edge) to the "Inbound" list of allowed SSH addresses.

You'll also have to use an RSA key, which you should have generated sometime during the setup. Go to the "Instances" tab under your console to get the public DNS name. Then you can SSH to your instance:

ssh -i [location of your RSA private key] [username]@[public DNS name]

For Ubuntu instances, the username is "ubuntu."

 

Installing software.

Congratulations! You now have root access on an EC2 instance. You have the freedom to install software the way you would on any other machine. You can check out a copy of your code, as well as everything you need to run it, this way.

How to run a web server.

We'll be describing how to use the Apache HTTP web server for serving websites off your machine. To run your server, first download Apache and the WSGI (Web Server Gateway Interface) module for interfacing with Python programs.

sudo apt-get install apache2 libapache2-mod-wsgi

Once you have done this, you should be able to access the Apache configuration file in /etc/apache2/apache2.conf. This file tells Apache how to serve your site, for instance by describing how paths should be resolved.

To make sure your Apache server knows about your demo project, first you'll want to set your Python path and alias your / path to wherever your WSGI configuration file is.

WSGIScriptAlias / /home/ubuntu/srv/testproject/testproject/wsgi.py
WSGIPythonPath /home/ubuntu/srv/testproject
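If you want to check the Apache and mod_wsgi wiring before involving Django, a stand-in WSGI file can help. The sketch below is not our actual wsgi.py; it is a hypothetical minimal file that assumes only that mod_wsgi looks for a callable named "application" in the file that WSGIScriptAlias points to. The wsgi.py that Django's startproject generates exposes the same name by calling get_wsgi_application().

```python
# Hypothetical stand-in for wsgi.py (not the file from our project).
# mod_wsgi imports the file named in WSGIScriptAlias and calls the
# module-level callable named `application` for every request.

def application(environ, start_response):
    """Return a plain-text page that echoes the requested path."""
    path = environ.get("PATH_INFO", "/")
    body = ("mod_wsgi is working; you asked for %s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # WSGI expects an iterable of byte strings.
    return [body]
```

If this page loads in the browser but your Django site does not, the problem is more likely in your Django settings or Python path than in the Apache configuration.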

You'll also want to add "Alias" entries for the static/ and media/ directories:

Alias /static /home/ubuntu/code/jeeves/demo/conf/static
Alias /media /home/ubuntu/code/jeeves/demo/conf/media

Finally, you'll want to add a "Directory" entry to set the permissions for the directory where you'll be serving your Python files from.

<Directory /usr/share>
  AllowOverride None
  Require all granted
</Directory>

To put these changes into effect, restart your Apache server:

sudo /etc/init.d/apache2 restart

You'll also want to change the ownership of your static/ and media/ directories so that they belong to the www-data user and group.

sudo chown -R www-data:www-data path/to/static/
sudo chown -R www-data:www-data path/to/media/

Now everything should work! Go to your hostname in the browser and see for yourself. Okay, so it is likely that there were some configuration errors and you get a "Bad request" or other error. When this happens, it is helpful to check your Apache error log, which can be found in /var/log/apache2/error.log.

Oh, and for Apache configuration files: a gotcha is that order matters, so for redirects you should put the most specific first and the most general last. A consequence of this gotcha is that if you have aliased '/' and you already have a Directory entry for '/', you need to move this to be after the Directory entry for the directory aliased to '/'.

 

Setting up your database.

If your Django application uses a database, you'll want to hook that up as well. Django has pretty good documentation for how to edit your settings.py for the database of your choice. You may need to install Python-specific libraries for interfacing with these databases. For instance, for MySQL you will want to install the python-mysqldb Ubuntu package. Once you have configured your database settings, running "syncdb" will set up your tables:

python manage.py syncdb
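For reference, here is roughly what the database section of settings.py might look like for MySQL. This is a sketch rather than our actual settings: the database name, the user, and the DEMO_DB_PASSWORD environment variable are placeholders made up for illustration.

```python
# settings.py fragment (sketch). The database name, user, and environment
# variable name are placeholders, not values from our setup.
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # needs python-mysqldb installed
        "NAME": "demo_db",
        "USER": "demo_user",
        # Read the password from the environment instead of hard-coding it.
        "PASSWORD": os.environ.get("DEMO_DB_PASSWORD", ""),
        "HOST": "localhost",  # database hosted on the same machine
        "PORT": "3306",       # MySQL's default port
    }
}
```

Reading the password from an environment variable is only one option. If you go this route under Apache, you will need to make sure the variable is visible to the server process and not just to your login shell.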

We found that our site ran much faster if we hosted the database locally. We followed the standard instructions for installing and running a MySQL database. For those who have never done this before, here is what you should expect to do:
  1. Install MySQL server.
  2. Configure your server by, for instance, setting a password for the root user.
  3. Start your MySQL server.
  4. Create a new MySQL database for use by your web application.
EC2-related: if you want to be able to access your database through SSH from other hosts (for instance, to back up your database from elsewhere), you will need to add a SQL entry to your security settings permitting access from the allowed IP address(es).

 

Getting ready for production.

Now you are ready to go! For your website to look the most professional, you will want to set DEBUG = False in your settings.py file. Once you do this, you will need to make sure the ALLOWED_HOSTS list includes your domain. An easy way to do this is to add the host '*' to the list.

And make sure the secret key you use in production is secret! 
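A sketch of what these production settings might look like follows. The domain and the DJANGO_SECRET_KEY environment variable are placeholders. Listing your actual domain is a stricter alternative to '*': a leading dot in an ALLOWED_HOSTS entry matches the domain and all of its subdomains.

```python
# settings.py fragment (sketch). "example.com" and DJANGO_SECRET_KEY are
# placeholders for illustration.
import os

# Never show debug pages, with their stack traces and settings, to visitors.
DEBUG = False

# A leading dot matches example.com and every subdomain of it.
ALLOWED_HOSTS = [".example.com"]

# Keep the key out of version control by reading it from the environment.
# Django refuses to run with an empty SECRET_KEY, so a missing variable
# fails loudly instead of silently using a weak key.
SECRET_KEY = os.environ.get("DJANGO_SECRET_KEY", "")
```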

 

How to host multiple projects on one server.

You might want to serve multiple demos, each with its own Django project. There are a couple of ways to do this. One is to do the appropriate aliasing in your Apache configuration file for different subdirectories (for instance, example.com/project1). If you go this route, you will have to make sure your redirects, includes, etc. point to the right place.

Another option, the one we took, is to use virtual hosts to put each project on its own subdomain. Here is how to add each new virtual host:
  1. Add a VirtualHost entry to your /etc/apache2/sites-available/[site name].conf file. For the main site the file is 000-default.conf.
  2. Enable this site:
    sudo a2ensite [site name]
  3. Reload your Apache configuration:
    sudo /etc/init.d/apache2 reload
    
Here is my VirtualHost configuration for jconf.jeeves.csail.mit.edu that lives in my /etc/apache2/sites-available/jconf.jeeves.csail.mit.edu.conf file. This post is getting long so I'm getting too lazy to explain all the parts, but you can see how I'm specifying paths, aliases, listening on port 80, and all that good stuff.

<VirtualHost *:80>
    ServerName jconf.jeeves.csail.mit.edu
    DocumentRoot /home/ubuntu/code/jeeves/demo/conf

    WSGIDaemonProcess jconf processes=5 threads=1
    WSGIScriptAlias / /home/ubuntu/code/jeeves/demo/conf/wsgi.py
    ErrorLog /var/log/apache2/jconf-error.log

    Alias /static /home/ubuntu/code/jeeves/demo/conf/static
    Alias /media /home/ubuntu/code/jeeves/demo/conf/media
    Alias /logs /home/ubuntu/code/jeeves/demo/conf/logs

    <Directory /home/ubuntu/code/jeeves/demo/conf>
      <Files wsgi.py>
        Order deny,allow
        Allow from all
      </Files>
    </Directory>
</VirtualHost>

Note that if you want things to run on subdomains, 1) you will need to use your own domain (rather than Amazon EC2's dynamically assigned DNS) and 2) you need to make sure you have DNS entries for the subdomains (you need to tell someone which IP addresses you would like for these subdomains to resolve to). There are instructions here about setting up your own domain name with EC2. Instructions for mapping subdomains will vary based on domain manager. (For CSAIL domains created with WebDNS, you can create subdomains by editing your hostname file and adding aliases for your subdomains.)
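If you end up hosting several demos, you may find it easier to generate the VirtualHost entries than to copy and edit them by hand. The helper below is hypothetical (we wrote our entries manually). It mirrors the main directives from the configuration above and leaves out the Directory/Files access block, which you would still need to add. The function name, template, and example paths are all made up for illustration.

```python
# Hypothetical helper for rendering one VirtualHost entry per project.
# It mirrors the directives in the hand-written configuration above but
# omits the <Directory>/<Files> access block.

VHOST_TEMPLATE = """<VirtualHost *:80>
    ServerName {server_name}
    DocumentRoot {project_dir}

    WSGIDaemonProcess {daemon} processes=5 threads=1
    WSGIScriptAlias / {project_dir}/wsgi.py
    ErrorLog /var/log/apache2/{daemon}-error.log

    Alias /static {project_dir}/static
    Alias /media {project_dir}/media
</VirtualHost>
"""


def render_vhost(server_name, project_dir, daemon):
    """Return the text of a VirtualHost entry for one Django project."""
    return VHOST_TEMPLATE.format(
        server_name=server_name,
        project_dir=project_dir.rstrip("/"),  # avoid "//" in joined paths
        daemon=daemon,
    )


if __name__ == "__main__":
    # Example values only; substitute your own subdomain and paths.
    print(render_vhost("demo.example.com", "/home/ubuntu/code/demo", "demo"))
```

You would save the output as /etc/apache2/sites-available/[site name].conf and then enable the site and reload Apache as described in the steps above.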

 

A final word.

There are a lot of details (version numbers; deprecation; death) involved with these web things, but it is so satisfying to get everything working. And if at first you don't succeed, try, try, try again.

* This claim is intended to be tongue-in-cheek. I had told Travis that there was so much misinformation on the internet that I wanted to write the definitive blog post. He laughed because this sentiment surely motivates every other post out there.

Tuesday, January 07, 2014

Careful Where You Click

The other day, a friend and I were talking about how fun it was to check Google Analytics. When I asked her if she knew most of her readers, she said she could figure out who many of them were. Especially if they were outside New York. Especially if they were in some remote location.

Google Analytics, a fantastic tool for optimizing your website, can also be a precise tool for stalking. All you need to do is insert a bit of code in the header and you can know who has visited your website, how they came across your website, how long they visited your site, and whether they have been to your site before. I show on the left a screen shot of analytics on the visitors from Florida for one of my websites*. Since there is only one person accessing the site from each of Gainesville and South Miami, we can figure out how much time each person spent on the site. As my friend Rishabh would say, "Walk the walks, stalk the stalks."

We've recently seen that even if you use tools that supposedly hide you, there are still ways you can be tracked. Last month, Special Agent Thomas Dalton released this document describing how he traced the bomb threats made against Harvard buildings during exams to a Harvard student hoping to avoid those exams. To avoid detection, the student had used Tor, software that prevents others from detecting the IP address from which an action originates--akin to anonymizing a phone number before making a prank call. Unfortunately for the student, the Harvard network was able to figure out who was accessing Tor at the time the e-mail was sent, narrowing the suspect pool down to our perpetrator.

There are many ways that you can be exposed if you are among a small handful of people engaging in some behavior. Last month, I attended a fascinating talk by Mike Specter at MIT about how people can track searches of obscure terms. Mike had done a search for "Pentagon SMO code" and discovered there was a gov.cn (Chinese government) website showing up towards the top of the search results. When he clicked on this website, he discovered that it showed its own search results for the terms--but not before tracking that he had visited the site**. Upon further investigation, Mike discovered that the site had performed search-engine optimization so that it could show up towards the top of the search results. Because search engines rely on automated algorithms, people can trick the algorithms into thinking their page is more important or more relevant to a given search term. This page used particularly insidious optimization techniques, including putting invisible comments called "pingbacks" on legitimate websites (such as the Yale University homepage) and using the .gov.cn domain so as to show up in all searches and not just searches in China. (Google personalizes results based on search location.) Mike found that this website showed up for many long-tail (obscure) searches, both relevant to the American government and otherwise. While it's not clear what this site is doing with this information (Mike speculates profit motives), what is clear is that people can easily be exposed while performing obscure searches.

While we can hope that researchers like Mike continue to watch out for us, we will also need to learn to watch out for ourselves. As our lives come to depend on increasingly complex technology, it is important for everyone to develop a basic level of technological literacy. It is clear that technology can be quite powerful in reducing and compromising internet privacy. It is up to us to define how much we allow: by raising awareness, by taking precautions, and by protesting when we find something to be unreasonable. As recent legislative decisions have shown, powerful people have been doing a lot of thinking about the future of the internet. It is up to us to make sure the future is not one in which we have no choice but to be exposed!

* Not this blog, guys. Florida readers, you remain anonymous here! 
** How much the site can find out about you depends on the level of sophistication of the site and the measures you have taken to hide yourself. The WSJ has a nice widget at the top that tells you what they can find out about you. I show mine below.

Wednesday, January 01, 2014

2014 is the Year of Laughing like Kafka

This year, I'm trying this trend where I have a theme instead of a resolution. And this isn't just because I broke all of my concrete resolutions by April of last year...

For 2014, the theme is to laugh like Kafka. Franz Kafka, who wrote the most dark and wonderful stories, would encourage everyone to laugh at the absurdly dark situations of his protagonists. During readings, apparently he would laugh so jovially and in such contrast with the grim content that people would be confused. Instead of feeling stressed or angry or scared, I want to laugh like Kafka at the absurdity of my own life.

It's been getting harder and harder to not take life too seriously. My mother is always telling me, "Aren't you getting a little old to have blogs where you take pictures of yourself wearing funny glasses?" (See here and here.) Junior Ph.D. students keep saying to me, "Aren't you really old? Why aren't you more serious?" And people are always saying, "You went to Harvard and MIT. Shouldn't you be, like, really serious?" Even though I'm getting really old and I subscribe to The Economist, this does not mean I need to have a permanent scowl.

There are many things to laugh about. How the mental health of Ph.D. students can be measured in terms of the number of days, and sometimes hours, in between existential crises. The fact that there exist contraptions called epilators that have dozens of tiny tweezers for pulling out body hair. That time I spent the most money ever on a tasting menu meal and then spent the next day incapacitated, conducting several important meetings via phone from bed, purging my digestive system in between. Misogynists, racists, xenophobes, and homophobes. Plagues of frogs, locusts, darkness, and death of the firstborn...

Laughing does not mean being disrespectful or apathetic. It simply involves seeing a situation in a way that is less weighty and overwhelming. So whenever I am looking less than happy in 2014, ask whether I should be laughing instead.

Wednesday, December 18, 2013

I'm Using Python Now

I have a confession. All these years I've been evangelizing strongly statically typed languages, I've been going home at night and using Python. Not all the time. It's more like the hungry vegetarian graduate student who comes to the free lunch and discovers only meat dishes. Sometimes you have to do what you have to do.

It started with small things here and there. Because of its libraries, Python has been my go-to language for web scraping jobs. Because it is concise relative to Java and principled relative to PHP, Python has become my go-to language for web backend programming. And as the web is getting fancier and fancier, I've been doing more of this. As the web has been getting fancier, my research has also involved more of this. Web backend programming. And also Python.

It's true. I'm now doing research programming in Python*. For my PhD, I've been developing Jeeves, a new programming language (and soon-to-be web framework) for automatically enforcing privacy policies. Jeeves makes the programmer's life easier by making the language runtime responsible for keeping track of the privacy policies. As this approach happens completely at run-time rather than compile-time right now, it is a great fit for embedding into Python, which does most things dynamically (at run time). And so we switched from the Scala implementation we had been maintaining for a couple of years to Python.

The initial reason for switching to Python was that it is my favorite popular language for web backends. Popularity matters: a web framework is a big piece of software. Lots of people using it usually corresponds to lots of people developing and maintaining it, as well as more documentation for how to use it. I had found both of these to be issues with Scala web frameworks: the two popular Scala web frameworks, Lift and Scalatra, have learning curves that are quite steep. The people-power behind development, maintenance, and documentation also means that it is likely that lots of people will continue using it.

Another reason for the switch was that Python is what the (MIT undergraduate) kids are learning these days. I had thought that I could push Scala upon undergraduate research assistants using the force of my charisma, but this is harder than it might sound. It can take upwards of half a semester to teach even a motivated, bright undergraduate Scala. And given that undergraduates tend to be around for about a semester, that does not leave much productive work time.

After switching to Python, I learned that playing with language features in Python is much faster than in Scala. Much of the time spent coding in the Scala implementation involved an intricate dance with Scala's type system. While it's thrilling never knowing if your new tricks will allow you to save yourself from writing type-casing boilerplate, it takes a lot of time to convince the Scala type-checker that you're doing something reasonable in all cases. Sure, people might object that writing in Python is like swimming without a life jacket or something, but if you're an experienced swimmer trying out different moves in a small body of water, a life jacket is just going to hinder you. Same effect for experienced programmers prototyping a research language...

Something else that I've discovered is that people outside of programming languages seem much more excited about using my language when I tell them it is embedded in Python. These people include computational biologists, computer scientists working in areas other than programming languages (systems; the web), and undergraduates who are trying to work on our research project. A computational biochemist I spoke to who has been using logic programming was quite excited that we were embedding our language in Python. There are already many biological modelling toolkits written in Python, he said, so he could easily envision people picking up our tool.

I'm not saying everyone should use Python instead of Scala. I still stand by everything I say in my other blog post praising Scala: Scala is a less pretty version of the ML family of languages that is potentially way more usable because of its interoperability with Java. If you're trying to do quick-and-dirty language prototyping or trying to build a web framework, however, Scala may not be the tool for the job.

I know I'm not going to be able to check my Python programs before I run them. And I know I'm never going to be able to run my programs super fast. But I've looked into the tradeoff space and made my choice. Good thing Python is for girls.

* In the past few decades, the fashion has been to conduct programming languages research using statically typed languages with strong type systems. The idea is that since we have the compute power to check our programs before we run them, it is irresponsible not to. Thus using something like Python is considered somewhat scandalous.

Wednesday, October 23, 2013

Dual Booting Ubuntu and Windows 8

This blog post commemorates the better part of a work day I spent installing a dual boot of Ubuntu Linux 13.10 and Windows 8 on my Lenovo X230 Thinkpad. I did not expect to have such trouble, but I did, and once I came out in the open about it on Facebook and Twitter I heard from many others who have fought the good fight--and given up. So maybe this post will be helpful to some of you.

If you're looking for quick instructions for adding an Ubuntu partition to a machine with Windows 8 installed, here they are:
  1. Manually partition your disk from your Windows 8 operating system. Go to the Control Panel, go to Disk Management, and create a new partition from your main partition. (More information here.)
  2. Acquire an Ubuntu installation mechanism, either from Ubuntu or by downloading a disk image and burning it onto a flash drive or CD.
  3. When starting up your computer, press "Enter" to interrupt the normal boot sequence. Press F12 to get into the boot menu. Select the option to boot from the flash/CD drive. Follow the directions to install Ubuntu.
  4. The Ubuntu install will mess up your boot loader and prevent you from loading Windows properly. To fix this, install Boot Repair on your Linux system and select "Recommended Repair." This will reinstall your GRUB and do some other things.
  5. The final piece of what you'll need to do is disable Secure Boot in your BIOS. Windows 8 uses it to make sure the pre-OS environment is secure. You can do it by pressing "Enter" at startup, getting into the BIOS options, and selecting "Disable" for the Secure Boot option. (More here on Secure Boot and here on disabling it.)
Read on for the full story.

The first question to address is why the dual boot. Linux is non-negotiable for coding. Besides feeling somewhat like it would be a waste to wipe out Windows, Windows is pretty useful for programs like Powerpoint, SolidWorks, and the new software that came with my drawing tablet (Autodesk, ArtRage, and Photoshop). Why don't I just get a Mac, you might ask. Maybe I'm waiting for free software to get good enough that I don't have to use Windows anymore. Or maybe I just haven't... yet...

The second question to address is why the dual boot and not virtual machines. At one point I was running a Windows virtual machine on Linux to use Powerpoint. And then my Windows decided to install updates... for ten minutes... during the beginning of my Research Qualifying Exam. After that, I decided dual boot was the better way to go. I hear having Windows as the host is better, but I mostly spend my time in Linux anyway. Windows is just for the special stuff every now and then.

Now for the story. I went on the Ubuntu website, downloaded a disk image, and burned it onto a disk as I normally do, expecting to be able to boot off the disk as usual. Ha. I tried turning on my computer a couple of times, thinking the system would detect the disk and boot off of it. That didn't work, I had thought because the boot order was not in my favor. I then wondered if I could take the easy route and use WUBI, the Ubuntu Windows installer, which was also part of the Ubuntu disk image. It looked like I was getting a little bit far in my Ubuntu installation: the system told me it succeeded and I even got a pretty "which OS do you want, Ubuntu or Win 8?" screen. But every time I tried to select the "Ubuntu" option I got a black screen of death saying my \ubuntu\winboot\wubuildr.mbr file was missing. A quick internet search revealed that WUBI is not to be used in conjunction with Windows 8 or UEFI hardware. (Okay, here I'll admit I tried reburning the CD at least once before doing this...) Apparently WUBI doesn't work with UEFI, the Unified Extensible Firmware Interface, because it uses grub4dos, which doesn't support GPT (GUID partition table) disks, a more flexible disk partitioning mechanism associated with UEFI.

I tried a little harder to boot off the disk and discovered that pressing "Enter" got me out of the normal boot sequence and F12 allowed me to boot off the disk. It looked like I successfully installed Linux again, until I shut down and tried to enter my Windows partition. There I got an equally scary screen saying my Windows couldn't be accessed anymore. I searched the error on Google and it said that I could probably address my boot issues using Boot Repair. I installed it, and during installation it reinstalled the GRUB (GNU GRand Unified Bootloader) and told me I needed to disable Secure Boot. Wondering if the second part was really true, I tried starting up Windows without disabling Secure Boot. No luck. It turns out that if Secure Boot is enabled, Windows 8 expects it to report back on certain properties that the dual boot breaks. (More here.) I went into the BIOS, found the Secure Boot [Enabled] option on the right-most screen, and set it to [Disabled]. (Apparently Linux systems can support Secure Boot now, but--unless I'm missing something--not for dual booting.)

And... then... it... worked! Now I have a working dual-boot of Windows 8 and Ubuntu 13.10. My usage of Windows 8 has been an endless source of amusement for my office friend Rishabh. ("Why do you have to do this just to get that to work?") Perhaps this can be the subject of a future blog post.

Wednesday, September 18, 2013

Why I'm Not Taking a Vacation from Facebook

The internet does not seem to like Facebook these days. Studies are coming out (for instance, this PLOS ONE study) suggesting that Facebook decreases happiness in young adults. In solidarity with the teenagers, Kayak founder Paul English is taking a vacation from it the month of October.

This is too bad. I don't hate social media. Like any of you I also like going into the mountains, throwing my phone into a lake, and bonding over shared processing of primal angst. But I wouldn't be the most happy doing only that. I thrive on being connected to hundreds of people at once. And you might, too.

I have always loved the internet for making this possible. As a kid with diverse interests and not that many people to talk to about them, the internet was a way for me to have the conversations I wanted to have. I have had an e-mail account since 1995. I may have read every page on the internet about the USA women's gymnastics team during the 1996 Olympics. I got a lot of flak for running my own GeoCities website about tamagotchis--complete with pop-ups and frames--in middle school. During my teenage years, when I became only slightly cooler, my social life consisted mostly, to my mother's chagrin, of spending my evenings with at least five AOL Instant Messenger chat windows open while "doing my homework." I talked to friends from my school, friends from other schools, friends I met at summer camp, and friends of friends who were interesting to talk to. If only there were a way for me to do this more efficiently...

When Facebook first came out, I was excited that everyone else could join me in having an active online life. As part of the first Harvard class ('08) to have Facebook accounts before arriving on campus, we all spent the later parts of our summers stalking our soon-to-be classmates. By the first week of freshman year, it was rare to come across someone who had not already established, through judging self-manufactured personas on Facebook, who was hot, who was not, and who was planning to take way too many classes. And sure, at times it was a bit overwhelming to be able to browse just how much smarter, better-looking, and popular other people seemed to be. But that's college. Insecurity is inevitable, especially on a campus where it seems like every other person is jumping to tell you how early they got up, how many miles they ran, or how smart their boyfriend was. And you learn to calibrate for the "Facebook gap:" people are probably adding a couple of inches to their height, taking a few pounds off their weight, and lying about their age... oh wait, that's online dating.

Even in the beginning, Facebook was useful for facilitating deeper connections. In late August before our freshman year, one of the Harvard websites had a glitch that allowed us to see our room assignments, which normally would not be available until we arrived on campus. News of this glitch spread through the mailing list for the incoming class (another useful virtual community) and many of us posted our room assignments to Facebook. This is how four of my five freshman roommates and I found each other and began corresponding. By the time we had gotten to school, we already knew where the others were from, what our backgrounds and habits were like, and what our hopes and dreams were for our freshman year. This helped us establish a rapport--as well as real memories we still refer back to--before we were able to meet in person. It may not be a coincidence that despite being quite different on the surface, four of the five of us continued living together for the rest of college.

Social media, by supporting the broadcast-and-see-what-comes-back method of social interaction, has enriched my life in many ways over the years. When I was younger and less busy, I would announce on Facebook whenever I was going to be in a different city so I could meet up with whichever friends happened to be there at the time. One time, Facebook helped me reconnect with a Korean-American friend I had not seen since we met at art camp in high school who was randomly teaching English in my Chinese hometown. Another time, my Facebook-location-announcement scheme facilitated a San Francisco hang-out with a childhood friend who introduced me to a friend who introduced me to a friend--via Facebook--who ended up showing my friend and me a fun evening in Barcelona. By giving me a forum for announcing my intentions to the world, Facebook has made it easy for the world to help me achieve what I want, whether it is having a discussion about some topic or acquiring some physical object. Facebook has given me all sorts of things: product and app recommendations (I learned about InstaPaper through Facebook), link suggestions, and even roommate invitations.

Because of how easy it makes it to access interesting information, social media has come to dominate my media consumption. For the media diary we were supposed to keep for a class last semester on the news ecosystem, I discovered that I spent 78.6 hours in conversations and it was my primary form of media consumption. (I had a pretty insane spreadsheet for tracking this...) Of my conversations, 27.6% occurred on social media. One could wonder whether I am spending my time gossiping when I could be reading the news, but let me convince you that this is not the case. On Facebook and Twitter, I like to follow people (for instance, Arianna Huffington) and organizations (for instance, Forbes Tech News) who post informative pieces. I also actively unfollow people who flood my stream with posts I don't care about. I have recently also begun following Facebook pages that provide a steady stream of positive quotes, for instance Positivity. Social media has made it much easier to do two of my favorite things: read about the world and have conversations about what I read.

You might ask whether this time I am spending on Facebook is taking time away from forming genuine connections. I don't think so. Much of the time I spend on Facebook is time I would otherwise use for working or reading, activities that are not particularly social. If I weren't having a conversation on social media, I would likely not be having a conversation at all--in-person conversations during work breaks take far higher levels of coordination and serendipity than the asynchrony of social media requires. For me, taking a break to go on Facebook or Twitter is like walking down a hallway full of exactly whom I want to see. For some kinds of work, it's also nice to have Facebook providing a warm background buzz. Of course it's good to have one-on-one conversations, but sometimes it's nice to be able to go to a coffee shop or party and experience human interaction secondhand. And just as it's not the best idea to do all of your work in the busiest coffee shop in town, you probably do not want to be constantly connected to all 3000 of your best Facebook friends.

So sure, if you are a misanthrope or agoraphobe it's probably best to stay off, but social media does not have to be bad. It may take some establishment of good practices, but what worthwhile activity doesn't?

Tuesday, July 23, 2013

Travel Meets Technology: A Weekend in Portland, Oregon

When Facebook Graph Search came out, skeptics bemoaned the end of personal privacy. Now that people can perform targeted queries over your social media history, there are few places left to hide. Facebook Graph Search will make it all too easy for your mother to find out that you have been drinking and for your boss to find out that you are a Republican.

The unveiling of Graph Search was exciting to me for more than voyeuristic reasons. For years, I have appreciated how Facebook has helped better connect me with people and enrich my world view. I have also liked how social media democratizes knowledge and allows for everyone to express their views in ways only experts and journalists previously could. Graph Search seemed like a useful way to learn even more about the people in my social network and also the rest of the world.

To explore my stance, I decided to see if Graph Search was useful for researching topics other than my friends’ personal lives. I came up with the goal of researching a lifestyle piece using solely Graph Search. As the target destination I chose Portland, Oregon because I had never been there and did not have many friends posting about it on Facebook. I assumed having few contacts in a location would be the most common experience of someone using Graph Search to learn about something new, as most people’s social networks tend to be fairly limited. I then planned a trip to Portland to compare the Graph Search itinerary to the New York Times’s 36 Hours in Portland, which represents an expert-curated itinerary for the same length of time.

This project became not just an evaluation of the effectiveness of Facebook Graph Search, but a study of the changing relationship between people, technology, and experts. On the one hand, technology provides us tools to scour the internet’s fares and opinions to potentially provide us with succinct summaries of the world’s information. On the other hand, it is not clear whether someone familiar with the domain in question can outperform technology. There is also the question of how we can use technology to enhance interactions with experts or to democratize the availability of expert knowledge. To explore this, I used other tools such as AirBnB, Bing Travel, and non-Graph Search capabilities of Facebook. The final itinerary is a result of cross-referencing the Graph Search and Times itineraries with sites like Yelp and with Portland locals, combined with serendipitous events.

I found my ad-hoc internet travel agent to be immensely useful for giving me fast and easy access to large quantities of information. Fare search and predictors give even amateur trip planners a good idea of times and prices at which to buy. Graph Search allowed me to see a collage of what my trip could look like visually through a simple search of “Photos taken in Portland, Oregon.” Sites like Graph Search, Yelp, and AirBnB provided up-to-date information about where people were spending their time and money. Advances in search technology, combined with innovations in peer-to-peer models for communication, allow us to learn about the world in previously impossible ways. With all this information at hand, it is tempting to think that the masses provide us with all of the opinions we need.

During this project, however, I learned to appreciate the curation of experts. Someone local or known to have good taste is more likely to make good recommendations than a random sampling of people from the internet. In general, unless you know something about the people posting about a place, it is difficult to determine how much to listen to the opinions presented. I realized that with sites such as Yelp and AirBnB, I close-read the writing style and content of reviews to form my own opinion about how “expert” I deem a reviewer about the relevant domain. It is, at present, difficult to determine the taste “footprint” of Facebook users posting about a place. Especially for researching travel and entertainment, it would be useful to be able to identify and weigh more highly the contributions of certain people, for instance relative experts or those with similar taste.

I describe the results in the form of a timeline for both the planning and trip periods.

Some night, months before.
12am.
Use only Facebook Graph Search to research a travel itinerary for Portland, Oregon (see According to Facebook Graph Search: 36 Hours in Portland, Oregon and Facebook Graph Search as a Journalistic Tool).

The next day.
Post to Facebook, Twitter, and your Gmail chat status about these posts. Chat online with a friend on the West Coast until he suggests a trip to Portland to explore these itineraries. Use Bing Travel's fare predictor to decide the best time to book plane tickets. (In my case, it told me to wait.)
Tools: social media; online messaging; fare search.

Some other night, a month before.

11pm.
Plot geographic locations corresponding to both the Graph Search and the New York Times (NYT: 36 Hours in Portland, Oregon) itineraries. Discover that the Graph Search itinerary seems to be concentrated around downtown (West) Portland, while the NYT itinerary has locations on both sides of the Willamette River. Discover that many of the places mentioned in the NYT article, written in August 2011, have already closed.
Tools: Google Maps; Google Places.

12am.
Examine the availability of apartments, rooms, and guest houses for rent on AirBnB. Discover that the more interesting, affordable, and highly rated locations seem to be in East Portland. Make a booking in the Mt. Tabor neighborhood.
Tools: AirBnB.

Yet another night before the trip.
Make a proposed itinerary for the weekend by combining activities from the two itineraries. Cross-reference with Yelp; discover that many of the Graph Search locations seem to be reviewed less favorably as being "touristy." Cross-reference with TripAdvisor; find that the recommended activities seem to be less urban and more outdoorsy.

Use Facebook to contact Joe, a friend of a friend who lives in Portland. Make plans to meet up.
Tools: Yelp; TripAdvisor; messaging.

Friday of the trip.

11am.
Brunch at Cafe Broder, whose Scandinavian brunch comes highly recommended by the Times. Broder is worth every minute of the potentially long wait. Try the lefse (potato pancake) and a baked egg scramble.
Tools: New York Times.

2:30pm.
Walk off brunch by shopping in the vintage and curiosity shops. Walk past Pok Pok, one of the most popular Asian restaurants in Portland.
Tools: serendipity.

4pm.
Get a late lunch at Por Que No? Taqueria, recommended by the Times itinerary. The Times warns of a long wait, but during off-peak hours the line is fine. The ceviche is delicious and there is the option of getting it on cucumber slices rather than with chips.
Tools: New York Times.

7pm.
Jog up Mt. Tabor, recommended by fellow reviewers of your lodging on AirBnB. Mt. Tabor was once an active volcano but is now merely a hill. Watch the gentlemen of Portland ride low bikes and skateboards down the hill as the sun sets.
Tools: AirBnB; serendipity.

10pm.
Try in vain for thirty minutes to call a cab from numbers off Google Search. (We still don't know if Mr. Taxi is real.) Default to dining at Sapphire Hotel, recommended by both your AirBnB host and a friend. A former seedy hotel, it now probably has one of the better cocktail menus you have ever seen. Enjoy bacon-wrapped figs and perhaps a burger while your friend drinks the "You're not my real dad," a bourbon cocktail that comes with a cigarette.
Tools: word of mouth.

12am.
Take a walk down Hawthorne Street, recommended by AirBnB reviewers as being close to shopping and dining locales of interest. Consider stopping in and playing pool or drinking a beer at one of the bars. Walk past various closed shops and a group of 20-somethings sitting on an awning and drinking. Take the scenic route back along residential streets. Take some time to smell the flowers. Especially if it is summer, they will smell great.
Tools: AirBnB; serendipity.

Saturday.

11am.
Propose going to the Portland Saturday Market, recommended by Graph Search. Wait for Joe to veto this suggestion, saying that it is full of people from the suburbs. Have Joe instead take you for Vietnamese food at Ha VL in Southeast Portland. Be impressed by the fact that it is in a shopping complex full of Asian restaurants and that the other patrons are largely Asian. Did you know that the Vietnamese have pho for breakfast?
Tools: Graph Search; word of mouth.

1pm.
Take a walk around the Japanese garden, recommended by Graph Search, the Times, and most other travel itineraries for Portland. After you achieve a sense of peace, take in the beauty of the rose gardens, founded in 1917 and the oldest continuously operating rose garden in the United States.
Tools: Graph Search; New York Times; word of mouth.

3pm.
Get all natural, hand-crafted ice cream at Ruby Jewel, recommended by Joe. If you have enough appetite, try the salted caramel apple pie a la mode.
Tools: word of mouth.

4pm.
Do some shopping downtown at stores you serendipitously discover by wandering around. Tanner, Polar, and Yo! Vintage are all on the same block.
Tools: serendipity.

5pm.
Wander around Powell's Books, discovered by Graph Search as a highly recommended destination. The largest independent new and used bookstore in the world, Powell’s takes up a city block and has multiple sections, including ones for comics and rare books.
Tools: Graph Search.

7pm.
Discover Cacao Drink Chocolate while walking and take a hot chocolate break. Sample the hot chocolate espresso shot or a cup of melted single-origin chocolate.
Tools: serendipity.

8pm.
Dine on French cuisine at the Little Bird Bistro, recommended by the NYT as the more accessible alternative to the popular Le Pigeon, flagship of chef Gabriel Rucker. Spend a leisurely couple of hours enjoying the food and cocktails.
Tools: New York Times.

11pm.
Imbibe local beers at Eastburn, recommended by Joe for its proximity to Little Bird. Enjoy the wide selection of beers on tap, perhaps on the patio.
Tools: word of mouth; serendipity.

Sunday.

11am.

Have brunch at Woodsman Tavern, which you discovered on your way back from Broder the other day and which a local called her favorite restaurant in Portland. On your way out, browse the snacks and sodas at the adjoining store.
Tools: serendipity; word of mouth.

2pm.
Ditch the original plan of going to Voodoo Donut, discovered via Graph Search, after a local tells you that it is "touristy" and a "last resort." After failing to hunt down the donut truck that is supposedly the best source of donuts, pick up a snack at Blue Star Donuts before they run out for the day. (This usually happens in the early afternoon!)
Tools: Graph Search; word of mouth.

3pm.
Wander around the boutiques of East Burnside. There is Redux, recommended by the Times as an analog Etsy housing the work of over 300 artists. There are also fun surrounding vintage shops. In the way of designer boutiques there is Machus for men's high fashion and Lille for lingerie.
Tools: New York Times; serendipity.

5pm.
Continue exploring downtown Portland. Walk around Pioneer Courthouse Square, recommended by Graph Search. Peer into some boutiques you passed by earlier but did not enter, such as Frances May.
Tools: Graph Search; serendipity.

7pm.
Dinner at Biwa, a homestyle Japanese restaurant recommended by the Times. Enjoy the yakitori (grilled chicken), handmade ramen and udon, rice balls, and sake.
Tools: New York Times.

Discussion.
It turned out that while Graph Search provided a nice initial preview of what Portland would be like, the New York Times and locals suggested higher-quality activities: those that were more highly reviewed on internet sites such as Yelp and by other locals. Serendipity is also a useful tool: finding one thing you like can help you find other similar things, either by walking around the neighborhood or by asking people there for suggestions. For all technology can do to provide new ways for people to interact, for the purpose of travel it would do well to start by replicating these processes of consulting experts and exploring clusters of similar options. Since we already have sites like Yelp and TripAdvisor to allow people to do this with strangers, it will be interesting to see how Facebook Graph Search can allow us to leverage the social graph to improve upon this.