Digital Labour and Development

Pasig City

The picture above was taken in Pasig City in the Philippines. The poster advertising free wifi is symbolic of the changing connectivities of a country in which more than 30 million people are now Internet users. The advert on the left, in turn, is symbolic of how many in the country have harnessed those new connectivities: setting up business process outsourcing (BPO) firms and performing digital work.

This, however, is a relatively old story and there are millions of people around the world working in the outsourcing sector.

But, in the last few years, we have seen some important changes. The rapid growth of online freelancing, digital work, and microwork is undoubtedly changing the landscape of digital work: creating jobs in people’s homes and internet cafes rather than in the kinds of offices full of BPO firms in the photograph above.

These changes could be seen as an important moment in the trajectory of global development: offering millions of skilled and unskilled workers in low-income countries access to jobs. But many concerns also exist. Not only are workers placed in potentially precarious positions, they are also potentially enrolled into new digital sweatshops with little opportunity to upgrade their positions.

It is in the context of those very different ways of understanding the intersections between digital labour and development that my colleagues Helena Barnard, Vili Lehdonvirta, Isis Hjorth, and I are embarking on a 30-month project to understand contemporary virtual production networks.

We are focusing on three countries in Southeast Asia and three in Sub-Saharan Africa, asking the following questions:

  • What is the overall landscape of virtual production networks in Sub-Saharan Africa and Southeast Asia?
  • What factors explain the network structures that we see?
  • How are these networks changing over time?
  • Who benefits from SSA’s and SEA’s virtual production networks?
  • How do observed changes differ from public, political, and academic discourses surrounding potential effects?

We are using a combination of quantitative (using log data from work platforms) and qualitative (six months of fieldwork) methods and plan to regularly release and share our findings.

Changing connectivities are undoubtedly profoundly influencing the landscape of digital work: enabling new flows, new networks, and new geographies. By studying virtual production networks in some of the world’s economic peripheries, we hope to ultimately understand who benefits and who doesn’t from these new forms of work.


Connectivity and Tourism in Rwanda

This is the second part of a three-part series that discusses the outcomes of recent outreach meetings in Rwanda for our project on changing internet connectivity in East Africa (part 1).

In this article Chris Foster (OII) and Claude Migisha K (ICT4D consultant) discuss the outcomes of an outreach meeting in the tourism sector.

Broadband and ICTs can contribute to economic growth and help improve delivery in many sectors. Tourism is a growing service sector in Rwanda, frequently targeting foreign customers, so we wanted to explore how relevant new connectivity has been to firms’ positioning in the global market.

Globally, ICTs and the internet are transforming tourism, and in Rwanda virtually all firms in the tourism sector use ICT in some form. However, successful use of specific technologies was often only undertaken by one or two firms in the sector. This workshop was set up as an interactive session to allow some of these successful practices to be shared, which could lead to wider improvement.

The workshop was organised hand in hand with the Rwanda Tourism Chamber (an umbrella association for all private companies in tourism-related business) and the Tourism department of the Rwanda Development Board (RDB). The session was introduced by Telesphore Ngoga, Community Conservation and Development Manager at RDB, and attended by hotel managers, tour operators, travel agents, tourism destination managers and policy makers.

1) Using internet to improve operations

The tourism sector in Rwanda includes a range of firms (see below) and activity often involves firms bundling tourism services, and organisation of these services. In the meeting we discussed how we found that the internet is being used to help firms improve their organisation and internal planning, reducing time spent on management and internal communications.


Firms linked to the Rwandan tourism sector.
Firms towards the right of the image may bundle together services towards the left for tourists

However, online interaction is frequently email-based with lower use of information systems and online services amongst smaller firms.

We did find interesting use of ICT apps and online services, but this was not widespread. Examples included some Rwandan tour operators using Dropbox to share high-bandwidth multimedia with international tour firms, improving their presentation of tour itineraries. Some hotels and guest houses were also adopting online booking systems such as Expedia, Hotels.com and Airbnb, which simplify payment from customers.


A growing number of hotels are using online services such as Expedia to increase bookings. They also allow more simplified payments from customers improving operations.

There is still work to do though. For instance, we identified coherent internal information systems and skilled management as crucial to more dynamic firms in the tourism sector. In many firms a lack of internal systems and clear management meant that connectivity was not being best exploited.

2) Using internet to reach customers

Websites and social media

There is a tendency for tourism firms in Rwanda to outsource online websites and social media to external web firms, for which they are often overcharged. In this meeting, successful firms stressed that it is crucial that content writing and updating are kept in-house to ensure that firms keep control of online resources.

A number of questions in the workshop were also posed around how best to manage and promote websites (and ultimately attract customers). For instance, new online platforms and using SEO (Search Engine Optimization) to attract more clients were two areas mentioned as areas of low knowledge.

Social media use in Rwandan firms was also an area of discussion. In our research we found that sometimes social media is seen by managers as a time-sink with unclear benefits. Yet, it is an increasingly important online resource – in how customers find, share and decide on tourism experiences. For those directly involved in social media, key issues discussed in the meeting included how to deal with bad reviews and what types of information are best to present on social media.

Markets and branding

Going online is often not a matter of reaching ‘more of the same’ customers. Firms in Rwanda that have been successful have been those that have reached specific demographics or targeted customer segments online.


Some tour firms in Rwanda have undertaken online market segmentation strategies, using variable online branding to reach different customer demographics

There were cases of firms that strategically pushed into niche areas (e.g. bird watching, community tourism, and regional customers) and were successful. In discussions it was felt that online activities are a key element of marketing and branding approaches – in looking for niches, and ensuring that niches reach a critical mass of customers.


In conclusion, our key research findings in the wider project were echoed in this meeting. Many firms in tourism have adopted and are actively using digital connectivity. But digital connectivity alone has not led to transformation. Rather, one can see a set of wider barriers to transformation.

Barriers can relate to skills in using available technologies within businesses. Online resources and services were also found to be difficult to integrate effectively and this can limit viability. Finally, existing firms may already be in well-established relationships with international tour firms which can make it difficult for new firms to grow to significant scale.

Some firms have resources to overcome barriers and feel the transformational benefits of connectivity, but many others are still searching for best use of connectivity. The goal should be to tackle these barriers to effective use of connectivity to drive improved benefits.

We released a short summary report on Rwandan Tourism sector and connectivity as part of this meeting which is available here. We will also be releasing a comprehensive report in late October that summarises this research.

We would like to thank those who attended the session and contributed to the lively discussions. We would also like to thank the Rwanda Tourism Chamber and RDB for their support in hosting this session.



Introducing GEONET: studying Sub-Saharan Africa’s knowledge economies

I’m happy to announce the launch of the new GEONET project: studying ‘Changing Connectivities and the Potentials of Sub-Saharan Africa’s Knowledge Economy.’

This five-year project, funded by an ERC Starting Grant, aims to understand the difference that changing connectivities are having on Sub-Saharan Africa’s emerging information economies.

For a full introduction to the project and associated team members, please head over to the GEONET site to take a look: geonet.oii.ox.ac.uk

We have a great group of researchers assembled, and I’m looking forward to seeing what we can accomplish over the next few years.



ICT, Connectivity and Rwandan Agriculture

This is the first of a three-part series that discusses the outcomes of recent outreach meetings in Rwanda for our project on changing internet connectivity in East Africa.

In this article Chris Foster (OII) and Claude Migisha K (ICT4D consultant) discuss the outcomes of the outreach meeting in the tea sector.


Agriculture is the backbone of many African countries’ economies. The East Africa region has a population of approximately 135 million, with more than 50% of the population living in rural areas and undertaking farming and agriculture as their main source of revenue.

We wanted to explore how our research on internet/ICT use in East African tea production could provide insight into improving agriculture. Specifically, we felt this research could provide in-depth knowledge on information flows (and the lack of them) that could be useful for technology developers involved in ICT4Ag (ICT for Agriculture). With this goal in mind, the outreach meeting took place in Kigali’s kLab innovation hub, which has been supporting the ICT4Ag agenda in Rwanda.

Present at the workshop were ICT solutions providers, prospective ICT4Ag developers, farmers, and brokers.

Outline of discussions

In this research on the tea sector we took a ‘value chain’ approach. This explores the relationships between firms involved in production of tea (from farmer to retailer) and analyses the ‘value’ that each ‘actor’ in the value chain is able to extract from their involvement in production.

The value chain approach was particularly useful in that it provided insight around information flows in two areas: it highlighted under-considered actors in the value chain; and it revealed post-processing and marketing processes which are often underplayed in ICT discussions.


The tea value chain for Rwandan tea production, based on our research

Firstly, by mapping production in tea in a systematic way, we were able to highlight important elements of agriculture and primary production which have been under-considered by ICT developers and ICT solutions.

For example, in the tea sector, co-operative associations play a vital role in supporting smallholder farmers; they are also likely to be a source of more innovative technology use. Yet ICT developers and ICT solutions rarely consider them. Indeed, in the Rwandan tea sector new ICT solutions, such as automated field weighing technologies, could potentially marginalise co-operatives.

We discussed such actors with meeting delegates who felt that the same under-consideration also applied in other sectors of agriculture. There may be other farmer-supporting ‘intermediaries’ in these sectors, such as government extension workers and NGOs who are also important, but they are rarely the focus of new ICT4Ag solutions.


Co-operative associations in the tea sector play an important role in information provision and support for low income smallholder farmers, yet their key role is rarely supported by ICT4Ag solutions.

Secondly, mapping value chains also highlighted new opportunities in post-processing and marketing in agriculture which have hardly been considered by ICT developers.

Examples from the tea sector included evidence that better provision of global tea prices and market intelligence in tea (e.g. competition, end-market analysis) would be a valuable service for tea firms and co-ops, something poorly provided for at present.

The growth in the importance of quality and standards as part of post-processing and marketing also potentially offers new opportunities, where information flows are rarely digitised.

In the outreach meeting, discussions suggested that much of the focus in agriculture has been on farmers involved in production for local markets, whereas export-orientated sectors such as tea, coffee and horticulture might be more closely considered in the future.


Kinyarwanda guidebook for RFA (Rainforest Alliance) certification. Information flows around standards are still decidedly analogue, but this is likely to change in the future.

These examples give a very brief flavour of some of our discussions in this meeting, and give an insight into how we identified and shared our research.

For those interested, we have an executive summary of our findings and we will be releasing a full report on the Rwandan tea sector in a few weeks.

How do we make ICT4Ag research relevant to developers?

Some of the most interesting discussions we had actually happened before the meeting, in informal conversations with developers in the kLab. For these developers, academic research was seen as having little relevance to their everyday work of developing technology solutions. Research results are often presented in obscure ways with complex theories. So, beyond our research findings, we also wanted to discuss the question: what is the point of research for ICT developers?

1) New ways of thinking about a sector – New ways of thinking can highlight new opportunities. For example, our ‘value chain’ approach in the tea sector can highlight surprising outcomes that ICT4Ag developers might look to tackle.

2) Moving beyond speculation – It is often said that the first step to creating a viable technology solution is to ‘know your problem’, and research such as our work in tea (based on over 100 interviews, from farmer co-operatives to some of the world’s largest multi-national tea firms) is a valuable resource for understanding the activities going on in agriculture.

Moreover, in-depth research can often question conventional knowledge. For instance, our research in tea questions the idea that growing internet use will remove ‘inefficient’ middlemen. So, research can provide developers with new directions and clear knowledge.

3) Beyond the generic ICT4Ag solution – As Chris has outlined before in discussing our research, one thing we have found in ICT4Ag is that developers may be pulled toward developing quite generic solutions. Often ICT4Ag solutions revolve around providing information on market prices for farmers or systems which improve access to markets.

In some cases these types of solution can be useful, but one needs to give close consideration to contexts and needs. For instance, in our tea research, market access systems have little potential where farmers are already part of global markets for tea (selling via the Mombasa tea auction). ICT4Ag solutions in such cases require more creative and evidence-based approaches if they are to have value.

What can we say about connectivity in agriculture?

In the outreach meeting, participants agreed that there are opportunities for the development of ICTs that improve flows of information and knowledge to tea farmers and cooperatives. Potential opportunities include information provision in terms of sharing agricultural research (fertiliser types, bushes), pest and disease control, provision of global market data and better co-ordination for cooperatives.

However, this work also suggested that connectivity (internet and mobile access, appropriate ICT applications) is not the only barrier to efficient agricultural sectors. During the session, farmers mentioned that they still lack the skills to identify appropriate ICT tools all along the farming cycle. Crucially, there was also discussion about how many of the actors in Rwanda, particularly farmers, are still in subservient relationships with global producers.

Whilst connectivity and well-focussed ICT applications can support improved ability and relationships, it may be that ICTs do not overcome more difficult barriers around skills, uneven relationships and power.

We would like to thank those who attended the session and contributed to the lively discussions. We would also like to thank kLab for their support in hosting this session.

Please see our summary report on Rwandan tea for more details. We will also be releasing a comprehensive report in late October that summarises this research.


Inclusion in the Network Society workshop

Chris Foster and I have just returned from the inspiring meeting on ‘Inclusion in the Network Society’ that was put together by IT for Change in Bangalore, India. 

The meeting brought together a diverse group of activists and scholars from every corner of the world to think critically through who (and what) increasing digitally-mediated connectivity is actually empowering. The contributions were often heartfelt and inspiring, and grounded in deep domain knowledge and research.

The final day also led us to attempt to think through what a shared research agenda might look like. We split into four groups and were tasked with attempting to congeal our efforts into only five questions. My group’s efforts are listed below (thanks to Sumandro Chattapadhyay for making sure we noted them all down). This is our first draft, and will be both reworked by IT for Change into a more coherent form and combined with the questions produced by the three other groups (who were all tackling somewhat different issues).

  • what is [X] in the context of an inclusive network society?
  • who creates, controls, captures, and gains social and economic value in digital networks?
  • what systems and structures, at different scales, constrain or enable communities and individuals living the lives they have reason to value?  What transformations count as emancipatory inclusion? How do we transform systems and structures to achieve those goals? And how do we ultimately work towards something that might look like an inclusive network society?  
  • what are the power structures, configurations, and geographies of voice and representation; and under what institutional conditions do these voices and representations lead to claim-making?
  • what do the institutional landscapes of data regimes look like, who controls them and how are they controlled? How can these regimes be made accountable, and under what kinds of ethical frameworks?

The full agenda should be published soon, and many of the papers can already be accessed at the IT for Change website (Chris and I have uploaded ours). The organisers will also soon be uploading videos of presentations and subsequent discussions for people who weren’t at the meeting.


Our paper at the Network Inclusion Roundtable: Geographies of Information Inequality in Sub-Saharan Africa


Chris Foster and I have had the opportunity to participate in the Network Inclusion Roundtable, organised by IT for Change in Bangalore.

Our short paper, titled ‘Geographies of Information Inequality in Sub-Saharan Africa’, is available at this link.

The paper begins to think about what connectivity means for inclusion in the ‘network society.’ Connectivity certainly isn’t a sufficient condition for inclusion and equity, and we need to ask whether it is a necessary one.

We point to connectivity as an amplifier: one that often reinforces rather than reduces inequality. We therefore need to move towards deeper critical socio-economic interrogations of the barriers or structures that limit activity and reproduce digital inequality. The categorisations developed in the paper offer an empirically-driven and systematic way to understand these barriers in more detail.


What is a tech innovation hub anyway?

Soon-to-launch THINK in Kigali: Incubator? Hub? Both?

Innovation and entrepreneurship “hubs” and “labs” are all the rage these days. A wide range of actors is convinced that hubs represent a genuinely new and exciting model for supporting (tech) entrepreneurs, in particular in Sub-Saharan Africa, which is the focus of my research.

Here is a snapshot of publications, just from the last two years, that have tried to define, assess, and take stock of the phenomenon:

While this list illustrates that innovation hubs are popular, little analysis has been done on why that is. One obstacle is that many, if not most, of the discussions around hubs use the term quite loosely. For one, “hub” has several connotations that our zeitgeist sees as desirable (such as open and egalitarian interaction, collaboration, or grassroots), and it appears to me that the word is often used as a brand more than a meaningful descriptor.

So, what is a (tech innovation) hub anyway? Is it just a trend term replacing “incubators”, “R&D labs”, “science parks”, “technopoles”, or “training facilities” that have recently fallen from grace? Is there anything special and new about organizations like iHub or the Impact Hub?

A few weeks ago, I joined a group of researchers[1] gathered by Tuukka Toivonen to discuss these questions. All of us had done empirical research on hubs, so we had an intuitive understanding of the concept. The group had also examined hubs with different goals, located in geographical and cultural contexts spanning Africa, Europe, and East Asia, which gave us a good range for comparisons. We were clear that hubs shouldn’t be reduced just to the hub space, and that they are instead a particular type of organization.

Word cloud of my workshop notes

Yet, we soon got lost in a jungle of buzz words and vague paradigms, and we found it surprisingly difficult to pin down the uniqueness of hubs as a new organizational form (if it is one) with conceptual precision. We also noticed that the ideals that hubs aspire to are often quite different from the more mundane realities of life and work inside of hubs.

So we decided to derive a hub definition based on the stereotypical ideal of a hub, which could then serve to distinguish hubs from other organizations based on their vision and mission. Only at a later stage did we want to compare ambition, reality, and actual “impact”.

Here are the attributes of an idealized hub that we came up with:[2]

  1. Communal

Hubs heavily emphasize that they are merely a meeting and convening point for a community, and that without this community, they would be nothing. A hub community is not just any group of people. Members of the community share a certain identity and have a sense of belonging and/or participation. This often translates into a higher mental activation (inspiration, motivation) around whatever is the common cause of the hub.

  2. Self-organizing and adaptive

The community idea is also at the heart of another defining feature of hubs: their self-organizing and adaptive nature. Hubs cannot be set up in a top-down manner; they always emerge from the “grassroots” initiative of innovators and entrepreneurs. While hubs are more stable and continuous than event series like Barcamps, Startup Weekends, conferences, or innovation competitions, they also constantly adapt to changing community needs. Accordingly, hub managers usually see themselves less as leaders and more as facilitators. While donors and sponsors are usually needed to fund hubs, they are only seen as supporters that are not allowed to impose an agenda that would not be in line with the needs that the community articulates. This is in direct accordance with principles of the Startup Community movement. However, constant adaptation often does not jibe well with institutional frameworks of funders that are based on pre-specified accountability, long-term planning, and targets, in particular in the context of monitoring and evaluation mandated by development organizations.[3]

  3. Instead of innovating, enabling innovators

Implicit in the previous points is another hub attribute that is noteworthy because it is often forgotten in discussions on the “impact” and effects of hubs: hubs are not themselves creators or implementers of innovations (or projects, startups, apps, etc.). Instead, hubs see themselves as enablers of innovators and entrepreneurs, or, even more broadly, doers of some sort. Hubs can be more or less selective and stringent in terms of which doers they support, but in the end entrepreneurs are seen to be the ones with the real-world impact, while the hub just enables them. As one can imagine, this makes attributing and quantifying the impact that the hub itself has very challenging, especially if the hub offers a range of membership tiers and a variety of more or less hands-on programs. This also means that expectations towards a hub’s impact can hardly be codified as pre-specified targets (such as “number of startups launched”), and instead evaluations need to trace indirect and unexpected causal pathways of impact that result from the enabling-the-doers setup.

  4. Heterogeneous knowledge, serendipitously combined

Maybe the most interesting feature of hubs is that they aim to convene like-minded individuals while at the same time bringing together people with different backgrounds and knowledge. For instance, to stimulate software and mobile app innovations, hubs usually aim to gather techies and coders, but also bring in business people and investors. At the core is the idea that startups need complementary inputs (e.g., creative product design and financing), but also that innovation inherently relies on new and unlikely combinations of existing knowledge. Hubs build on the notion that ideation and creativity can neither be pre-specified nor coerced, and so they aim to create a structure in which individuals serendipitously interact with others that they would not typically meet. This is similar to the argument that “thinking in silos” inhibits creativity, and so hubs invite people to step out of their regular work routines and openly interact with new contacts until a “happy accident” happens. Such a setup also relies on a non-hierarchical and open relationship structure between community members: everyone is encouraged to engage with everyone else. What exactly the right combinations of like-mindedness and heterogeneity are is a difficult question, and it seemed to us to be one of the most interesting lines of inquiry for research on hubs.

  5. Local outposts of a higher cause

Another intriguing facet of hubs is that they emphasize adaptation to local context, but at the same time tend to frame themselves as part of a global movement. The “global entrepreneurship movement / revolution” often serves as the overarching value and belief system, and the Lean Startup and Business Model Canvas are examples of more concrete shared understandings of tech entrepreneurs. In fact, co-working itself is increasingly seen as a global movement, and it has started to yield insights and templates for the design of a hub space (which might explain why hub spaces look so alike across vastly different geographies and cultures). Importantly, hubs become local representations of globally homogeneous understandings (“movement”), but in the local context, where these understandings are unique and new, the hub can actually host a subculture (“revolution”) compared to incumbent and prevalent organizational designs and ways of doing things.


Now, I want to be clear that an organization does not necessarily have to meet all of these attributes to qualify as a hub. Instead, this list is meant to describe the idealized and stereotypical hub concept. Hubs usually only emphasize a few of these attributes, or they might aspire to live up to each one but in reality cannot meet all of them. Financial sustainability also plays a major role: hubs often have to make compromises and bend to funders’ and sponsors’ agendas to keep their ship afloat.

Yet, I do think that the above description helps us to distinguish hubs, for instance, from organizational models that only provide immediate business and resource support without a communal element, like traditional incubators. We also now have a better basis to discuss what is new about hubs, and how we could tackle questions about their effectiveness and role in innovation (eco)systems. My hope is that we can build on this start of a conceptual understanding to improve the quality of our discussions, and stop comparing apples to oranges.

Still, an actual hub will almost never fall neatly into the outlined concept. In fact, incubators, accelerators, and even science parks have started to borrow elements from the hub concept, resulting in mashups of organizational models. Another trend is the co-location of pure hub models with more structured support programs, like inside and near Bishop Magua Centre in Nairobi.[4]

In upcoming blog posts, I will outline ideas for theoretical perspectives and empirical research, as well as basic hub categories and preliminary findings on tradeoffs and funding structures from my previous and ongoing research. Please get in touch with any feedback and comments.


Tuukka Toivonen provided comments and feedback to this post. My research is funded by the Clarendon Fund and the Skoll Centre for Social Entrepreneurship at the University of Oxford.


[1] The workshop participants can be found on twitter @Tuukka_T @KindaAS @williamhan @Andrejcisneros @TimsWeiss @queaky @IrinaVPopova, and LinkedIn http://lu.linkedin.com/in/lgryszkiewicz

[2] In an interview that came out shortly after our workshop, Erik Hersman, co-founder of the iHub in Nairobi, reflected on the iHub and its unique features. Interestingly, his points are very similar to the attributes that we derived based on a much wider sample of hubs.

[3] This actually mirrors a broader problem and debate about incentive-setting and results measurement in international development projects, for instance, discussed by Robert Chambers here.

[4] In fact, people have started to call the building itself an “ecosystem”, for example, the iHub community manager.


Off to Explore the Inner Workings of African Tech Innovation Networks

Tech innovation hubs like kLab in Rwanda have been established across Africa

Even if you are just a casual follower of technology in developing countries, you will probably by now have come across blog posts and news articles touting Africa’s tech entrepreneurship boom.[1] Indeed, the first fast-growing mobile app startups have come up, the first Pan-African startup innovation platforms and conventions have been assembled, and thousands of aspiring technologists and would-be entrepreneurs across the continent are now looking to solve problems and build companies with technology. Along with the buzz, a consensus has emerged that local digital production—such as African startups targeting mobile applications at businesses or consumers in their home country or region—could and should be an important contribution to economic and social development.

At the same time, right underneath the feel-good patina of mainstream and donor media that is happy to report African success stories, there are also critical voices and emerging debates. Many of the arguments revolve around the risks and benefits of supporting local tech entrepreneurs, and how to use the scarce available resources.[2] In this context, the rise of a new type of organization, the tech innovation hub, has caught people’s attention.[3] However, it has proven extremely tricky to identify the desired and actual impact of these systemic innovation intermediaries (Smits & Kuhlmann, 2004), and so the few available assessments range all the way from questioning to excitement to disillusionment.

In short, there is clearly a lot of confusion around how to support tech entrepreneurship and early stage innovation across Africa, and no definitive models have emerged. Stakeholders of local innovation systems are still grappling with a long list of questions concerning the “if”, “who”, “how”, “when”, and “where” of tech entrepreneurship support. In particular, it is unclear whether hubs are effective as innovation brokers (Klerkx & Leeuwis, 2009).

In my dissertation research, which goes beyond my work at infoDev, I want to address some of these questions. Based on my initial reading of the available literature and evidence, I believe that “innovation networks” (that is, the relations and interactions between entrepreneurs and other actor groups in innovation systems) are a key part of the puzzle, and so I will apply qualitative and quantitative social network analysis as an analytical method.[4]

From September to December 2014, I will kick off the data collection and spend one month each in Kigali, Harare, and Accra. With this study setup, I hope to capture tech innovation happening (or not happening) in contexts that differ in terms of factors such as geography, economic development, entrepreneurship mentality and “culture”, tech innovation legacy, “vibrancy” and the number of already present actors in the innovation system, and many other influences.

I’m hoping that the result will be a better understanding of the dynamics underlying African tech innovation systems. Ultimately, my research is meant to inform and shape the policy and decision-making of people engaging in questions around tech entrepreneurship and local digital production in cities all over the continent.

I invite you to comment and contact me if you would like to be involved in this research, or simply be kept posted about my findings. Do reach out especially if you are a stakeholder of the tech innovation systems of Kigali, Harare, and Accra. Academics interested in this line of research should also take a look at the AAG call that Mark Graham, Isis Hjorth, and I recently put out.



Hekkert, M. P., Suurs, R. A. A., Negro, S. O., Kuhlmann, S., & Smits, R. E. H. M. (2007). Functions of innovation systems: A new approach for analysing technological change. Technological Forecasting and Social Change, 74(4), 413–432.

Klerkx, L., & Leeuwis, C. (2009). Establishment and embedding of innovation brokers at different innovation system levels: Insights from the Dutch agricultural sector. Technological Forecasting and Social Change, 76(6), 849–860.

Smits, R., & Kuhlmann, S. (2004). The rise of systemic instruments in innovation policy. International Journal of Foresight and Innovation Policy, 1(1-2), 4–32.


[1] Here is a sample of articles in major media outlets: New York Times, Huffington Post, The Guardian, NPR, The Economist, Tech Crunch, and Wired UK.

[2] A nerve was struck by a Wired UK article that resulted in a chain of at times combative blog posts by Tom Jackson, Sam Gichuru, Mbwana Alliy, Josiah Mugambi, Erik Hersman, and Jon Stever (as collected by Erik and Jon in their blog posts). A more recent debate was started by Dan Evans and followed by answers from TMS Ruge and Jon Gosier. Also the comments section of Tim Kelly’s widely noted blog post brings up some interesting issues.

[3] At the risk of missing others, a few of the better known addresses on tech innovation hubs are: Afrilabs, VC4Africa’s tag on Hubs, Bongohive’s Hubs Map, the iHub blog, and Afrihive.

[4] That said, I will complement the network analyses with broader, qualitative innovation system assessments based on Hekkert et al.’s (2007) “functions” perspective.


Diary of an internet geography project #4

Continuing with our series of blog posts exposing the workings behind a multidisciplinary big data project, we talk this week about the process of moving between small data and big data analyses. Last week, we did a group deep dive into our data. Extending the metaphor: Shilad caught the fish and dumped them on the boat for us to sort through. We wanted to know whether our method of collecting and determining the origins of the fish was working by looking at a bunch of randomly selected fish up close. Working out how we would do the sorting was the biggest challenge. Some of us liked really strict rules about how we were identifying the fish. ‘Small’ wasn’t a good enough description; better would be that small = 10-15cm diameter after a maximum of 30 minutes out of the water. Through this process we learned a few lessons about how to do this close-looking as a team.

Step 1: Randomly selecting items from the corpus

We wanted to know two things about the data that we were selecting through this ‘small data’ analysis: Q1) Were we getting every citation in the article or were we missing/duplicating any? Q2) What was the best way to determine the location of the source?

Shilad used the WikiBrain software library he developed with Brent to identify all roughly one million geo-tagged Wikipedia articles. He then collected all external URLs (about 2.9 million unique URLs) appearing within those articles and used this data to create two samples for coding tasks. He sampled about 50 geotagged articles (to answer Q1) and selected a few hundred random URLs cited within particular articles (to answer Q2).

  • Batch 1 for Q1: 50 documents each containing an article title, url, list of citations, empty list of ‘missing citations’
  • Batch 2 for Q2: Spreadsheet of 500 random citations occurring in 500 random geotagged articles.
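The two batches above can be sketched in a few lines of Python. This is only an illustration of the sampling idea, not the WikiBrain code Shilad actually ran: the function name `draw_samples`, the input format (a dictionary from article title to its list of external URLs), and the choice of one random citation per sampled article are all assumptions made for the example.

```python
import random


def draw_samples(article_urls, n_articles=50, n_citations=500, seed=0):
    """Draw the two hand-coding batches from a corpus.

    article_urls: dict mapping article title -> list of external URLs
    (a hypothetical input format, assumed for this sketch).
    """
    rng = random.Random(seed)      # fixed seed so the sample can be reproduced
    titles = sorted(article_urls)  # stable order before sampling

    # Batch 1 (Q1): whole articles, used to check citation extraction.
    batch1 = rng.sample(titles, min(n_articles, len(titles)))

    # Batch 2 (Q2): one random citation from each of n random articles.
    # Articles without any external URL cannot contribute a citation.
    with_urls = [t for t in titles if article_urls[t]]
    chosen = rng.sample(with_urls, min(n_citations, len(with_urls)))
    batch2 = [(title, rng.choice(article_urls[title])) for title in chosen]

    return batch1, batch2
```

Fixing the seed means that everyone on the team can regenerate exactly the same batches before coding them.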

Example from batch 1:

Coding for Montesquiu

  1. Visit the page at Montesquiu
  2. Enter your initials in the ‘coder’ section
  3. Look at the list of extracted links below in the ‘Correct sources’ section
  4. Add a short description of each missed source to the ‘Missed sources’ section

Initials of person who coded this:

Correct sources

Missing sources

Example from batch 2:

url domain | effective domain | article | article url
books.google.ca | google.ca | Teatro Calderón (Valladolid) | http://en.wikipedia.org/

For batch 1, we looked up each article and checked that the algorithm we were using was catching all the citations. We found three kinds of anomalies: duplicated citations (for example, when a single citation contained two urls: one to the ISBN address and another to a Google books url); missing citations (for example, when the API listed a URL only once although it had been used multiple times, or when a book was cited without a url); and incorrect citations (when the citation url pointed to the Italian National Institute of Statistics (Istat) article on Wikipedia rather than the Istat domain).
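A check like this can be expressed as a comparison of two multisets: the URLs the algorithm extracted and the URLs a human coder found on the page. The sketch below is an assumption about how one might automate that comparison, not the procedure we followed (we did it by hand); `audit_citations` and its argument names are hypothetical.

```python
from collections import Counter


def audit_citations(extracted, hand_coded):
    """Compare algorithm output with a coder's list for one article.

    Both arguments are lists of URLs; repeats matter, because the same
    URL can legitimately be cited more than once in an article.
    """
    ext, hand = Counter(extracted), Counter(hand_coded)
    return {
        # Extracted more often than the coder saw it (duplicates and
        # incorrect citations both end up here).
        "extra": sorted((ext - hand).elements()),
        # Seen by the coder more often than it was extracted.
        "missed": sorted((hand - ext).elements()),
    }
```

Citations that have no url at all, such as a printed book, cannot be caught this way and still need a note from the coder.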

The article for the town of El Bayad in Libya, for example, contained two citations that weren’t included in the analysis because they didn’t contain a url. One appears to be a newspaper and the other a book, but I couldn’t find the citations online. These would not be included in the analysis, but this was the only example of its kind:

  • Amraja M. el Khajkhaj, “Noumou al Mudon as Sagheera fi Libia”, Dar as Saqia, Benghazi-2008, p.120.
  • Al Ain newspaper, Sep. 26; 2011, no. 20, Dar al Faris al Arabi, p.7.

We listed each of these anomalies in order to work out whether a) we can accommodate them in the algorithm or b) there are so few of them that they probably won’t affect the analysis too heavily.

Step 2: Developing a codebook and initial coding

I took the list of 500 random citations in batch 2 and went through each one to develop a new list of 100 working URLs and a codebook to help the others code the same list. I discarded 24 dead links and developed a working definition for each code in the codebook.

The biggest challenge when trying to locate citations in Wikipedia is whether to define the location according to the domain that is being pointed to, or whether one should find the original source. Google books urls are the most common form of this challenge. If a book is cited and the url points to its Google books location, do we cite the source as coming from Google or from the original publisher of the work?
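To make the ‘domain that is being pointed to’ option concrete, here is a minimal sketch of how the url domain and effective domain in the batch 2 example could be derived with Python’s standard library. The helper names are hypothetical, and the ‘last N labels’ rule is a deliberate simplification: a real implementation would need something like the Public Suffix List to handle suffixes such as co.uk, and this is not necessarily how our own pipeline does it.

```python
from urllib.parse import urlsplit


def url_domain(url):
    """Return the host part of a URL, e.g. 'books.google.ca'.

    Returns '' when the URL has no scheme, because urlsplit then
    treats the whole string as a path.
    """
    return urlsplit(url).hostname or ""


def naive_effective_domain(host, suffix_labels=1):
    """Keep the registrable part of a host name (simplified assumption).

    suffix_labels is the number of labels in the public suffix:
    1 for 'ca' or 'com', 2 for 'co.uk'.
    """
    labels = host.split(".")
    return ".".join(labels[-(suffix_labels + 1):])


host = url_domain("http://books.google.ca/books?id=abc")  # made-up book id
print(host)                          # books.google.ca
print(naive_effective_domain(host))  # google.ca
```

Note that neither value says anything about the original publisher of the book, which is exactly the question raised above.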

My initial thought was to define URL location instead of original location — mostly because it seemed like the easiest way to scale up the analysis after this initial hand coding. But after discussing it, I really appreciated when Brent said, ‘Let’s just start this phase by avoiding thinking like computer scientists and code how we need to code without thinking about the algorithm.’ Instead, we tried to use this process as a way to develop a number of different ways of accurately locating sources and to see whether there were any major differences afterwards. Instead of using just one field for location, we developed three coding categories.

Source country:

Country where the article’s subject is located | Country of the original publisher | Country of the URL publisher

We’ll compare these three to the:

Country of the administrative contact for the URL’s domain

that Shilad and Dave are working on extracting automatically.

When I first started doing the coding, I was really interested in looking at other aspects of the data, such as what kinds of articles are being captured by the geotagged list, as well as what types of sources are being employed. So I created two new codes: ‘source type’ and ‘article subject’. I defined the article subject as: ‘The subject/category of the article referred to in the title or opening sentence, e.g. “Humpety is a village in West Sussex, England” (subject: village)’. I defined source type as: ‘The type of site/document etc. that *best* describes the source, e.g. if the url points to a list of statistics but it’s contained within a newspaper site, it should be classified as “statistics” rather than “newspaper”’.

Coding categories based on example item above from batch 2:

subject | subject country | original publisher location | URL publisher location | language | source type
building | Spain | Spain | US | Spanish | book

In our previous project we divided up the ‘source type’ into many facets. These included the medium (e.g. website, book etc.) and the format (statistics, news etc.). But this can get very complicated very fast because there are a host of websites that do not fall easily into these categories. Examples include a url pointing to a news report by a blogger on a newspaper’s website, or a link to a list of hyperlinks that download as spreadsheets on a government website. This is why I chose to use the ‘best guess’ for the type of source: choosing one category ends up being much easier than the faceted coding that we did in the previous project.

The problem was that this wasn’t a very conclusive definition and would not result in consistent coding. It is particularly problematic because we are doing this project iteratively and we want to try to get as much data as possible so that we have it if we need it later on. After much to-ing and fro-ing, we decided to go back to our research questions and focus on those. The most important thing that we needed to work out was how we were locating sources, and whether the data changed significantly depending on what definition we used. So we decided not to focus on the article type and source type for now, choosing instead to look at the three ways of coding location of sources so that we could compare them to the automated list that we develop.
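Once the automated list exists, the comparison could be as simple as an agreement rate per coding scheme. The sketch below assumes the hand-coded spreadsheet is loaded as a list of dictionaries; the field names, including `admin_contact_country` for the automatically extracted value, are hypothetical placeholders and not our actual column names.

```python
def agreement_rates(rows, hand_fields, auto_field="admin_contact_country"):
    """Share of citations where each hand-coded country matches the
    automatically extracted one.

    rows: list of dicts, one per coded citation (assumed format).
    Rows with an empty value on either side are skipped for that field.
    """
    rates = {}
    for field in hand_fields:
        pairs = [
            (row[field], row[auto_field])
            for row in rows
            if row.get(field) and row.get(auto_field)
        ]
        rates[field] = (
            sum(hand == auto for hand, auto in pairs) / len(pairs)
            if pairs
            else None  # nothing to compare for this field
        )
    return rates
```

A large gap between the rates for the three hand-coded fields would tell us that the choice of definition matters for the results, which is the question we decided to focus on.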

This has been the hardest part of the project so far, I think. We went backwards and forwards a lot about how we might want to code this second set of randomly sampled citations. What definition of ‘source’ and ‘source location’ should we use? How do we balance the need to find the most accurate way to catch all outliers against the need for a method that we could abstract into an algorithm and scale up to look at all citations? It was a really useful exercise, though, and we took a few lessons from it.

  • When you first look at the data, make sure you all do a small data analysis using a random sample;
  • When you do the small data analysis, make sure you suspend your computer scientist view of the world and try to think about what is the most accurate way of coding this data from multiple facets and perspectives;
  • After you’ve done this multiple analysis, you can then work out how you might develop abstract rules to accommodate the nuances in the data and/or to do a further round of coding to get a ‘ground truth’ dataset.

In this series of blog posts, a team of computer and social scientists (Heather Ford, Mark Graham, Brent Hecht, Dave Musicant and Shilad Sen) is documenting how we are working together on a project to understand the geography of Wikipedia citations. Our aim is not only to better understand how far Wikipedia has come towards representing ‘the sum of all human knowledge’ but to do so in a way that lays bare the processes by which ‘big data’ is selected and visualized.