Tuesday, January 31, 2012

#LAK12 Model-building in EDM

[Photo: Miners' Memorial, by Tim Duckett (tim_d)]

Ryan Baker's presentation on educational data mining today as part of the Learning Analytics and Knowledge MOOC helped me to better understand the model development concept in EDM. One of his slides listed several types of learner behaviors for which he developed models. When learners interacted with educational software, he was able to describe certain learner behavioral patterns (number of clicks, wait time between clicks, order of clicks, etc.) in such a way that if the data log files were analyzed for any learner, those behaviors could be spotted. He could identify when a learner was exhibiting behaviors such as:


  • carelessness

  • off-task activity

  • gaming the system

  • avoiding help when they needed it

  • not asking for help because they didn't need it

  • guessing

I'd be very interested in hearing more about the model development step, and how the researchers constructed meaning from a pattern of clicks.
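
To make the idea concrete for myself, here is a minimal sketch of how a pattern of clicks might be turned into a label like "guessing." This is NOT Baker's method, which was not described in detail in the presentation. The record layout, the two-second threshold, and the run length of three are all my own assumptions, chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Click:
    learner: str
    seconds_since_last: float  # wait time between this click and the previous one
    correct: bool              # whether the answer submitted by this click was right

# Hypothetical thresholds, chosen only for illustration.
FAST_SECONDS = 2.0  # anything quicker than this counts as a "fast" answer
MIN_RUN = 3         # consecutive fast, wrong answers needed to raise a flag

def flag_possible_guessing(clicks):
    """Return the learners who have a run of MIN_RUN fast, incorrect answers."""
    flagged = set()
    run = {}  # learner -> length of the current fast-and-wrong streak
    for c in clicks:
        if c.seconds_since_last < FAST_SECONDS and not c.correct:
            run[c.learner] = run.get(c.learner, 0) + 1
            if run[c.learner] >= MIN_RUN:
                flagged.add(c.learner)
        else:
            run[c.learner] = 0  # a slow or correct answer breaks the streak
    return flagged

# Made-up log entries for two learners.
log = [
    Click("A", 1.0, False), Click("A", 0.8, False), Click("A", 1.2, False),
    Click("B", 15.0, False), Click("B", 20.0, True), Click("B", 1.0, False),
]
print(flag_possible_guessing(log))
```

Even in this toy version, somebody had to decide what "fast" means and how many wrong answers in a row count as guessing. Those choices are exactly where the meaning gets constructed.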


There are two observations that I took away from the presentation.


1) Model-building is inherently value-laden. Baker alluded to this a bit when he mentioned that learner behaviors that were observed in one of his modelling tests were vastly different in the Philippines as compared to the United States. Learner behaviors are conditioned by culture and situated in a culture. As educators, we too are the product of our culture and cannot avoid building our assumptions into every tool and system that we develop. Since these data models are used to classify learners, model builders need to approach their task with the utmost humility and care.


2) The educational data mining approach seems to be an example of behaviorism. There was much discussion about the observable behaviors of the learners but not much about the learners' internal thoughts and feelings. Baker did allude a bit to thoughts and feelings of U.S. and Philippines students when they interacted with his Scooter software. For me, that was the most interesting part of the presentation.

Monday, January 30, 2012

#LAK12 No Shadow Databases!









Respondents to the ECAR study by Goldstein said that they have not had a lot of success in reducing the presence of shadow databases. Shadow databases are "silo" information systems maintained by separate functional units in the organization (or even within the same unit) which contain the same types of information, or perform similar functions. The problem is, usually they do NOT contain the same information. Shadow databases proliferate when there is weak oversight of the institution at the highest levels of administration, when little communication exists between units, and when individuals are rewarded not for sharing information but for hoarding it. A perfect description of many institutions in the world of higher education.




The problem with shadow databases is that they represent wasted time, effort, and money. Not just by wasting the opportunity to learn from others' information, but sometimes even in the money put toward purchasing redundant systems. But they do serve an important sociological function in the workplace: preserving the autonomy of a unit within a large organizational structure.




Institutions with "performance dashboards" as part of their analytics technology report the best outcomes for eliminating shadow databases. No doubt this is because the dashboard tools are very attractive analysis tools for almost every organizational unit, and this makes units want to opt in. Just as likely, however, is that these systems suddenly make it glaringly obvious whose information is missing, thereby providing some pressure from peers and higher-ups to participate.


Eliminating shadow databases is indeed a business improvement that the faint-hearted should not attempt.

#LAK12 examples of 5 levels of sophistication




Goldstein's Academic Analytics: The Uses of Management Information and Technology in Higher Education (2005) mentions "five levels of sophistication of use" of academic analytics.





1) transaction only: i.e. "How many students logged in to Course Y in the last 24 hours?" You would run a query on the learning management system but it would only happen on command.

2) "analysis and monitoring of operational performance": An illustration of this might be: "What is the trend in Mediasite (lecture capture) usage throughout the semester?" This specific question is especially appropriate if your resource is limited to a certain number of concurrent users, for example. You can do this analysis at the previous level if you download the data into Excel, create graphs, etc., but it is easier if your data tool provides an automatic visualization.

3) "What-if decision support, such as scenario building": I can't even think of an example of this. Please comment and give me an example if you know of one. I'll give the next one a try...

4) "Predictive modeling and simulation": An illustration of this might be "How will classroom scheduling practices have to change 4 years from now if we know that we have X students in the Class of 2015 with certain types of characteristics?"


5) automatically triggering a business process. This would be akin to a Walmart-like situation wherein, when a customer buys a stock item with a certain SKU, stock levels of that SKU are checked and compared with the rate at which that SKU has been purchased that week and the average time to reship from the vendor; then, depending on those results, an automatic order is sent to the vendor. I think this is what LA aspires to. For example, a student doesn't log in to the CMS for a certain number of days and, depending on their demographics or the time of the semester, student services or the course instructor sends out an exploratory "Hey, how's it going? How can we help?" email, which is most likely automated as well.
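
The student example in level 5 can be sketched in a few lines. This is only a rough illustration under my own assumptions: the student ids, the login dates, and the seven-day threshold are hypothetical, and a real system would pull last-login dates from the CMS rather than from a hand-typed dictionary.

```python
from datetime import date

# Hypothetical threshold: days without a login before someone reaches out.
INACTIVE_DAYS = 7

def students_to_contact(last_login, today, threshold=INACTIVE_DAYS):
    """Return the sorted ids of students whose last login is at least `threshold` days old."""
    return sorted(
        sid for sid, seen in last_login.items()
        if (today - seen).days >= threshold
    )

# Made-up last-login dates keyed by a made-up student id.
last_login = {
    "s001": date(2012, 1, 29),
    "s002": date(2012, 1, 20),
    "s003": date(2012, 1, 24),
}

for sid in students_to_contact(last_login, today=date(2012, 1, 31)):
    # A real system would send the check-in email here; this sketch only prints.
    print(f"Queue check-in email for {sid}")
```

The threshold is passed in as a parameter so that it could vary by time of semester or by student group, as described above.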

Note that illustration #4 offered above tends to be more of an "academic analytics" use, which I interpret as more institution-based, i.e. How can we run this institution more efficiently? Illustrations #1 and #5 are more "learning analytics" focused, which I interpret as learner-based, i.e. How can we help an individual or class of individuals succeed academically?


The article further mentions that central research administration is one of the organizational units least likely to use analytics, which is rather telling. Here's a possible use: strengths of association between academic departments or individual research faculty could be analyzed through bibliographic citation data, or through subject tags for their publications (like MeSH headings, for those of you who use PubMed). Individuals who are found to be highly associated in terms of the domains in which they work could be potential collaborators, joint/bulk purchasers, etc. It would be especially valuable for larger institutions which are spread out geographically, and for central research administration, which sometimes is not aware of these types of links. Cost savings would be realized, and the "team science" concept could be implemented. I'm sure this is one of the purposes of systems such as Harvard Profiles and VIVO, which allow visualization of connections between researchers and departments.
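
As a rough sketch of the subject-tag idea, one simple way to score how strongly two researchers are associated is the Jaccard similarity of their tag sets (shared tags divided by all tags used by either person). I am not claiming that this is how Harvard Profiles or VIVO work. The researcher names, the tags, and the 0.5 cutoff below are made up for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Shared tags divided by all tags used by either researcher (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def likely_collaborators(tags_by_researcher, min_score=0.5):
    """Return (name, name, score) for every pair scoring at least min_score, best first."""
    pairs = []
    for r1, r2 in combinations(sorted(tags_by_researcher), 2):
        score = jaccard(tags_by_researcher[r1], tags_by_researcher[r2])
        if score >= min_score:
            pairs.append((r1, r2, score))
    return sorted(pairs, key=lambda p: -p[2])

# Made-up researchers with illustrative subject tags on their publications.
tags = {
    "Researcher A": {"Neoplasms", "Genomics", "Biomarkers"},
    "Researcher B": {"Neoplasms", "Genomics", "Immunotherapy"},
    "Researcher C": {"Education", "Libraries"},
}

for r1, r2, score in likely_collaborators(tags):
    print(f"{r1} and {r2}: similarity {score:.2f}")
```

Citation data could be handled the same way by swapping the tag sets for sets of cited works.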

#LAK12 Google vs. your English 101 instructor

LA is fundamentally ethically different from the analysis done by online vendors because of power differentials. The relationship between you and your English 101 instructor, or your institution of higher education as a whole, is much different from the relationship between you and Google or Facebook or Amazon.

When online, we often operate under the assumption that our activities are occurring in our bedrooms, so to speak. We don’t always consciously operate as if our online activities are actually occurring out in the scrutinizable open – although we should be aware of that, if we read the fine print in the terms of agreement. Yet, if they are aware, most people are OK with knowing that Google/Amazon/Facebook is mining their every tweet. Most of us don't know anybody working for those corporations, and even if we did, we know that we are one of multimillions...our online actions are just a drop in the big data bucket.

However, the dynamic between a student and the institution of higher education in which he/she is enrolled is quite different. In this case, the people analysing the student's data trail may actually KNOW them, and may have the power to award grades, offer or withhold a job reference, or dole out scholarships, work-study jobs, or internship leads. Now, our data is subject to the institutional gaze, and those eyes do have the power to reward and punish. It is a power differential tilted markedly toward the institution. Learning analytics may help a student persist along the path towards college completion, but I believe that the analysands' feelings about being scrutinized by individuals in positions of power (including their instructors) at a university would be much different from their feelings about Google or Facebook mining even that same pot of data.

Saturday, January 28, 2012

Learning Analytics getting increasing library press...LAK12

I've been Googling "learn* analyt*" and "librar*" from time to time for about a year now. A year ago, the only things that I could find were reports, not written by librarians, that mentioned academic library data as possible fodder for learning analytics and, of course, instances wherein "library" was used in a mere data sort of way, i.e. "data library." That's starting to change. Today I found:

The Library Impact Data Project, a JISC-funded project in the UK in which systems librarians at the Univ. of Huddersfield seek to describe the relationship between library usage and student attainment; identify courses that require a high level of library usage and describe the practices that lead to this; and identify courses that do NOT require library usage and improve services to those courses. I'm definitely going to look into this one more: http://library.hud.ac.uk/blogs/projects/lidp/

Stephen Abram's blog post from Jan. 3, 2012 that points to EDUCAUSE's "7 things about analytics" article, but without any analysis of it: http://stephenslighthouse.com/2012/01/03/learning-analytics/comment-page-1/#comment-19972

In 2010, this was published: Kumar, S., Edwards, M. & Ochoa, M. (2010). Analysis of Online Students’ Use of Embedded Library Instruction in a Graduate Educational Technology Course. It used data logs from the LMS in which library instruction was embedded to examine student usage of library resources. That goes far beyond the usual self-reported "preferences/how did you like it?" type of research that we often see in library-related publications.

The Sept. 2011 issue of College and Research Libraries had an article by Wong and Cmor describing a study that linked GPA to library workshop attendance with n=8,000+ students. The librarians had access to the school's ERP to do this; interestingly, the study was done at a Hong Kong university and there is little mention of any consent from the students involved, although student names were unlinked from GPA, and also no mention of an IRB-type approval process in the way we would think of it in North America. The article is at http://crl.acrl.org/content/72/5/464.full.pdf+html I imagine the only similarity between this study and true LA is the link between the library and the ERP -- a link which was enacted outside the ERP, at any rate.

The Georgia Perimeter College Libraries' GPC blog by Rebecca Rose explains,
A library application for learning analytics could be a report alerting the library to order more materials in specific subject areas, based on enrollments in specific courses. Another idea would be to get lists of students who have never experienced library instruction. http://gpclibraries.wordpress.com/2011/04/21/technology-trends-in-libraries-pt-3/

And, finally, lots of library-related blog posts referencing the 2011 Horizon Report, which listed LA as a technology with a 4-5 year time horizon. Right now, these are mostly informational posts, and do not offer any visions or analysis for LA applied to libraries.

So, overall, it appears that tech-focused librarians are becoming aware of LA, but outside of a couple of projects, libraries are not involved with LA very much at all. If I'm wrong, please leave a comment and give me a clue.

Thursday, January 26, 2012

LAK12 Learning Analytics: Some Questions from the Drive In

Driving to work this morning in the light rain I brainstormed about some possible questions about learning analytics:

What are the intellectual freedom and privacy implications of LA?

Can/should LA data be used in evaluations of teachers?

What are the philosophical assumptions/bases of LA? Is LA based on a behaviorist, humanist, progressive, etc. educational philosophy?

Who owns the data generated by students? (I'm pretty sure I have seen this addressed in some of the literature, perhaps not the LA literature, but elsewhere.)

Is this considered human subjects research and what are the informed consent implications of that? Can students opt out?

How will archiving of the data be managed? What can the data be used for in the future?

What are the questions that can be answered with the data? Which of the answers are actionable? i.e. "Data shows that students who visit the cafeteria 3 times by the 12th hour of the new semester in their first year have a higher chance of graduating in four years." Is this a cause or an effect? Can we act on this information?
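
To see what a finding like the cafeteria one amounts to, here is a tiny sketch that compares graduation rates between two groups. The records are invented, not real data, and the calculation says nothing about which way the causation runs.

```python
def graduation_rate(records, visited_early):
    """Share of students in the chosen group who graduated in four years."""
    group = [graduated for visited, graduated in records if visited == visited_early]
    return sum(group) / len(group) if group else float("nan")

# Made-up records: (visited cafeteria 3 times early on, graduated in four years)
records = [
    (True, True), (True, True), (True, False), (True, True),
    (False, True), (False, False), (False, False), (False, False),
]

print("early visitors:", graduation_rate(records, True))
print("everyone else: ", graduation_rate(records, False))
```

A gap between the two rates only describes an association. Students who show up at the cafeteria early may already be the ones with friends, meal plans, or on-campus housing, so nudging everyone toward the cafeteria would not necessarily change anyone's odds of graduating.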

During this MOOC, I want to see learner analytics technology in action and learn a lot more about the data side. But my primary interest is in the ethical questions.

Looking forward to tackling the reading list.

Wednesday, January 25, 2012

Analytics and the Academic Library: Peaceful Coexistence? LAK12

There is one thing you must know about librarians: we prize intellectual freedom and the privacy of the individual user. For example, the American Library Association's "Intellectual Freedom Principles for Academic Libraries" states:
"The privacy of library users is and must be inviolable. Policies should be in place that maintain confidentiality of library borrowing records and of other information relating to personal use of library information and services."

In many US states, there are laws stating that library usage records can only be divulged in the case of user consent, subpoena, or court order. Librarians throughout American libraries have resisted the USA PATRIOT Act, which infringes on the civil liberties of library users to access information free from the "watchful eye" of the government.

Given these values, imagine my surprise to read statements such as this one, made by George Siemens and Phil Long in their piece in EDUCAUSE Review, "Penetrating the Fog: Analytics in Learning and Education:" "Similarly, most analytics models do not capture or utilize physical-world data, such as library use, access to learning support, or academic advising." [Here, they weren't stating that it shouldn't be done, but that libraries presented a data stream that just has not been captured yet.]

Don't get me wrong...I am deeply interested in learning analytics from the viewpoint of an educator and learner and I think that it holds promise in reaching goals of degree completion and retention (if these are indeed the right goals). But can academic library usage data be used in learning analytics in an ethical manner that remains true to ideals deeply held by my profession? Or will the ideals of intellectual freedom and privacy be overcome by the college completion agenda? These questions I look forward to exploring in LAK12.

Tuesday, January 24, 2012

LAK12: A Poem to Begin

I can't help but recall a piece by one of my favorite poets which keeps me grounded in the technology and information onslaught. T.S. Eliot published this in 1934. "Where is the knowledge we have lost in information? Where is the wisdom we have lost in knowledge?"

Opening Stanza from Choruses from "The Rock"

The Eagle soars in the summit of Heaven,
The Hunter with his dogs pursues his circuit.

O perpetual revolution of configured stars,

O perpetual recurrence of determined seasons,

O world of spring and autumn, birth and dying

The endless cycle of idea and action,
Endless invention, endless experiment,
Brings knowledge of motion, but not of stillness;
Knowledge of speech, but not of silence;
Knowledge of words, and ignorance of the Word.
All our knowledge brings us nearer to our ignorance,
All our ignorance brings us nearer to death,
But nearness to death no nearer to GOD.
Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?
The cycles of Heaven in twenty centuries
Bring us farther from GOD and nearer to the Dust.

T. S. Eliot (1888-1965),
The Rock (1934)