Monday, November 12, 2012

Reply to "Putting Learners in the Driver's Seat With Learning Analytics

Over at Online Learning Insights, Debbie Morrison gives an overview of learning analytics and posts about a worrisome announcement made by e-textbook vendor Course Smart. The company is now offering instructors data on student engagement with its e-textbooks. Ms. Morrison's concern is that "eyeballs on textbook" do not equate to learning:

"Yet Course Smart’s [in my opinion] program is an example of learning analytics gone awry. The ‘packaging up’ as mentioned by Ms. Clarke refers to the program Course Smart developed with data on students’ reading patterns. The program looks at how students interact with the e-textbooks, the number of times a student views a page and for how long, highlights made, etc. Course Smart compiles this ‘data’ and sends a Student Engagement Report to professors.  Are these metrics a true measure of a student’s level of engagement? "



I also find this announcement troubling, but for a reason beyond the assumptions it reveals about student engagement. I find the monitoring of online textbooks troubling for the same reason that I sometimes feel uncomfortable when someone looks over my shoulder and asks, "What are you reading?" My question is: will this type of surveillance have a chilling effect on learners' intellectual freedom? Learning should be a time for low-stakes "play" that is not constantly monitored and evaluated. If a learner knows that her every move in the textbook is being monitored, it raises the stakes for that learner.

In my opinion, it is not necessarily the monitoring of material that has actually been assigned that is troublesome -- it's the monitoring of material that was not assigned, but that the student found through serendipity or simply finds interesting. What about material that is controversial, or that resonates with a learner for personal reasons? What about the learner who is reading about a personal health issue or a relationship problem? Do we really want others to see what we have highlighted in our own books, especially if they might learn something about us that we don't necessarily want them to know? Knowing that our reading is being monitored -- not just by an impersonal vendor like Amazon or Course Smart, but by our own instructors -- will make us more careful about what we read. It impinges on our intellectual freedom.

Thursday, July 26, 2012

Sharing data about library usage, not about subject content

I watched a fantastic presentation on the University of Minnesota's Library Data and Student Success Project.  This project analyzed library usage across 5 types of transactions: book loans, use of library e-resources, use of library computer workstations, online reference transactions, and instructional workshops. Although they had to tie the usage to individual users, the data was only reported in the aggregate. I now have an itch to reproduce their methodology at our library. 

In terms of privacy, they have an interesting graphic indicating that they had to shift their practice a little outside the customary library paradigm in which no user-identified data is shared. They are sharing data about library usage, but not about content. So don't worry, Gophers: the library is only sharing the fact that you checked out books, not that you were reading Fifty Shades of Grey. (A rough sketch of what such content-free logging might look like follows the table below.)
We kept this:                          But not this:
Checked out X books                    Actual book titles
Attended X workshops                   Actual workshops
Reference interaction                  Substance of interaction
Logged into library workstation        Date, location, duration
Used an ejournal                       Actual ejournal title
                  (from University of Minnesota Libraries, http://blog.lib.umn.edu/ldss/2012/05/a-word-about-privacy.html)
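
Out of curiosity about how that separation could work in practice, here is a minimal sketch -- my own guess at a schema, not the UMN project's actual one -- of a usage event that deliberately records the transaction but not the content:

    # Minimal, hypothetical sketch of a content-free library usage event.
    # The field names are my invention, not the UMN project's schema.
    from dataclasses import dataclass

    @dataclass
    class UsageEvent:
        user_id: str        # needed to link usage to outcomes
        event_type: str     # "checkout", "workshop", "reference", "ejournal"
        term: str           # coarse time bucket instead of exact timestamps
        # Note what is *not* here: no title, no call number, no question text.

    events = [
        UsageEvent("u123", "checkout", "Spring 2012"),
        UsageEvent("u123", "ejournal", "Spring 2012"),
    ]

    # Reporting happens only in the aggregate, per the project's approach.
    print(f"{len(events)} usage events for {len({e.user_id for e in events})} user(s)")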

Do librarians obsess about user privacy to an unnecessary extent? Perhaps. I'll share a true story from my workplace: two of my librarian colleagues approached me with concern, saying that a teaching faculty member had come to the library to ask whether a couple of his students had been there that morning. It seems that they had ducked out of a required lecture, stating that they had to go to the library to take care of an IT-related task, and the instructor was double-checking their story. The request made my colleagues very uncomfortable. In discussing it with them, it became clear that part of their reluctance to share this information was that they felt disclosing whether a person had been in the library would infringe on users' privacy.

Would most librarians, with our deeply held values regarding user privacy, confidentiality, and intellectual freedom, agree that this type of user tracking is in keeping with those values? Or do our values need to shift in order to realize the potential value for students and our own institutions that can be uncovered by this type of analysis?

Tuesday, July 24, 2012

Learning analytics at Penn State

What is Penn State planning to do with learning analytics?  As a faculty member at Penn State, I'm interested in this question, so I've been exploring it over the past few months.

In April, several individuals from across Penn State's campuses virtually attended the EDUCAUSE Learning Initiative event on learning analytics.

Also in April, a presentation by Chris Millet, Simon Hooper, and Bart Pursel, individuals from various educational technology groups at the University, was held at University Park.  This presentation gave the campus community background on learning analytics and outlined current work being done in the area, which includes:
          --Early Progress Report: this is a low-level use of LA which is nonetheless helpful to students and faculty, as it automates the process of alerting faculty and academic advisers about students who are receiving a C or less in any course as of the third week of the semester.  Not only is the faculty alert automated, but students also receive an email message alerting them to their own progress.
           --examination of the relationship between blog and wiki posting in the learning management system and course GPA. Do students who are more connected and communicative with course colleagues fare better in terms of grades?
            --Simon Hooper is helping faculty design better multiple-choice tests by analyzing student performance on discrete test questions and comparing it to overall GPA and performance on other assignments involving specific learning objectives.  In doing so, "bad" test questions -- those that don't discriminate well between those who have mastered a specific learning objective and those who haven't -- can be eliminated or redesigned. (A rough sketch of this kind of item analysis appears below.)
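
Speaking as an outsider to psychometrics, here is a minimal, hypothetical sketch of what such an item analysis could look like; the function and the numbers are my own invention, not Hooper's actual method. It estimates how well one question discriminates by correlating item correctness with students' scores on the rest of the test:

    from statistics import mean, pstdev

    def discrimination_index(item_correct, rest_scores):
        """Point-biserial-style correlation between getting one question right
        (item_correct: list of 0/1 per student) and each student's score on
        the rest of the test (rest_scores)."""
        s_item, s_rest = pstdev(item_correct), pstdev(rest_scores)
        if s_item == 0 or s_rest == 0:
            return 0.0  # no variation, so the item cannot discriminate
        cov = (mean(c * r for c, r in zip(item_correct, rest_scores))
               - mean(item_correct) * mean(rest_scores))
        return cov / (s_item * s_rest)

    # Invented data: a question that stronger students tend to answer correctly.
    print(discrimination_index([1, 1, 1, 0, 0], [92, 88, 75, 60, 55]))  # roughly 0.92

A question whose value hovers near zero (or goes negative) would be a candidate for redesign or elimination.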

Then in May I met with Chris Millet from Teaching and Learning with Technology, Penn State's educational technologies group.  He had just returned from LAK12, the Second Annual Conference on Learning Analytics and Knowledge, and he graciously shared with me some of what he learned at that conference.  Millet also described some of the work of a recently formed Learning Analytics group that has been charged by Penn State Provost Robert Pangborn to explore and implement the use of learning analytics at the University.

There is still a lot of work to be done in developing capacity in this area at the University.  The choice of a new learning management system to replace ANGEL, our current LMS, will also affect the adoption of learning analytics.  Many LMSs now have learning analytics components built in.  Unfortunately, any University-wide implementation of learning analytics will be complicated by the College of Medicine's choice to adopt Moodle, an LMS that apparently is not being considered by groups elsewhere at the University.  Furthermore, according to Millet, only about 75% of faculty across PSU have even adopted ANGEL, the current LMS.  Will the numbers improve with a new LMS? Perhaps the 25% of faculty who have not adopted ANGEL have good pedagogical reasons for not doing so -- maybe they're using technology in other ways.

I noted with interest this quote from Chris Millet concerning data sources for LA: "The analysis of this data, coming from a variety of sources like the LMS, the library, and the student information system, helps us observe and understand learning behaviors in order to enable appropriate interventions.”  Again, mention of the library as a source of data.  Yet I wonder how many librarians know about learning analytics and are currently considering how libraries might be involved?

Friday, July 20, 2012

A reaction to the ELI Brief, "Learning Analytics: Moving from Concept to Practice"

EDUCAUSE Learning Initiative has just issued a new brief, Learning Analytics: Moving from Concept to Practice. It is a synthesis of discussions at the Learning Analytics and Knowledge Conference (LAK12) and the ELI 2012 Spring Focus Session.  Here are some reactions from an academic librarian:

Learning analytics systems are built around assumptions about the variables that predict and indicate academic success of students.

At academic institutions using learning analytics, one of the most important decisions is which pieces of information about a student will predict his/her success or indicate that he/she is succeeding. If a student's high school GPA predicts their performance in the first year of college, then we need to feed that information into the system and use it in our predictive models.  If the number of times a student eats in the cafeteria in week two of the semester is unrelated to academic success, then we don't need to get data from the cafeteria.  But what if we don't yet know -- because we had no way of mining that data until now -- which variables are truly indicative or predictive of academic success?  It would seem that getting as much data as possible into your system, and then mining it, would be the way to go.  If library usage is correlated with academic success, then we need to put it into the system; but what if we don't really know yet whether it is correlated? Then it would seem that mining library data as part of learning analytics is the way to find out. (A toy sketch of checking one candidate variable follows.)
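
Here is a toy sketch, with invented numbers rather than real student data, of the kind of preliminary check I have in mind: look at whether a candidate variable, such as library e-resource logins, is even associated with outcomes before deciding whether it belongs in a predictive model.

    from statistics import mean, pstdev

    # Hypothetical per-student counts and grades -- invented for illustration.
    library_logins = [0, 2, 5, 7, 12, 15]
    term_gpa       = [2.1, 2.4, 3.0, 2.9, 3.5, 3.7]

    def pearson_r(xs, ys):
        mx, my = mean(xs), mean(ys)
        cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
        return cov / (pstdev(xs) * pstdev(ys))

    print(f"Pearson r between library logins and GPA: {pearson_r(library_logins, term_gpa):.2f}")

A sizable correlation would justify adding the variable to a proper model for real testing; it would not, by itself, show that library use causes better grades.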

Visualization tools in learning analytics make the data understandable to users of the system, including students and faculty.

However, Santos and Duval of Katholieke Universiteit Leuven report that some students said they didn't like other students being able to see their activity on the analytics dashboard -- in the cases where each individual student's effort is displayed alongside that of others in the course as a benchmark. A potential intellectual chilling effect? A violation of privacy? This point is related to library values regarding user privacy and confidentiality.

Learning analytics, in the end, are only as good as the follow-up. I concur.

If institutions do not act on the information they gather from learning analytics, then it is simply surveillance and not truly related to teaching and learning. Perhaps this is where libraries can best be involved in learning analytics: helping at-risk students with learning interventions. More on this later.






Friday, March 2, 2012

#LAK12 upcoming EDUCAUSE learning analytics focus session

The announcement about the ELI 2012 Online Spring Focus Session on learning analytics arrived in my inbox yesterday. Yet again, the reference to libraries:

"The analysis of this data, coming from a variety of sources like the LMS, the library, and the student information system, helps us observe and understand learning behaviors in order to enable appropriate interventions. "

Because EDUCAUSE's focus is the management and use of, and leadership in, information resources, librarians are a significant portion of the group's constituency. So I'm glad, in fact, that librarians are explicitly listed in the focus session announcement as a target audience.

However, since the sharing of user data (at least personally identifiable data) would seem to be against librarians' professional code of ethics, I'm still stumped as to how libraries have come to be placed as possible participants in the learning analytics sphere. Don't get me wrong...I'm glad to have been asked to the party, but I'm not sure I'm going to want to dance. Some libraries are involved in analytics (for example, the Library Impact Data Project in the UK, which I learned about from commenters on this blog), and I'm curious as to how their projects work while still honoring values such as intellectual freedom and user privacy.

Wednesday, February 29, 2012

#LAK12 Gamifying SNAPP

In a recent post over at David Jennings' Quickened with the Firewater blog, David asks: "What would happen if we put learners in charge of analysing their own data about their performance?"

Here's how it could happen with SNAPP, a software tool that helps online instructors analyze the communication patterns among students who use the LMS's communication tools:




  • The instructor would allow communication to happen using the LMS, perhaps providing some task guidance or parameters to inspire the online discussion.


  • The network visualization (like the examples shared by the SNAPP group; a rough sketch of how one is built from forum data follows this list) could be shared with the students, who would use the visualization to help them think critically about how communication unfolded and what they individually added to the discussion.


  • Learners could then be challenged to change the visualization by changing their own communication patterns as a group, thereby gamifying the system.
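
As promised above, here is a minimal, hypothetical sketch -- not the SNAPP tool itself -- of how forum reply records could be turned into the kind of network SNAPP visualizes, using the networkx library:

    import networkx as nx

    # Invented reply records: (author of the post, author being replied to)
    replies = [
        ("ana", "ben"), ("ben", "ana"), ("cho", "ana"),
        ("dee", "ana"), ("dee", "ben"), ("cho", "dee"),
    ]

    g = nx.DiGraph()
    for speaker, addressee in replies:
        # Weight edges by how often one student replies to another.
        if g.has_edge(speaker, addressee):
            g[speaker][addressee]["weight"] += 1
        else:
            g.add_edge(speaker, addressee, weight=1)

    # Simple centrality numbers students could use to reflect on their own role.
    print(nx.degree_centrality(g))

If learners changed their posting behavior, these centrality numbers (and the picture drawn from the graph) would change with them, which is exactly the feedback loop the "game" depends on.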




Learners would gain insight into their group communication processes and play with different communication roles. This might be especially useful in a management or communication course, but such insights would be valuable to anyone no matter their course of study. But what learners might also gain, in addition to learning about communication principles, is insight into how such systems can be manipulated, and how the systems might also be used to manipulate learners -- both worthy learning goals.


P.S. Thank you, Shane Dawson, for a fantastic presentation yesterday about SNAPP for the learning analytics massively open online course!

Wednesday, February 22, 2012

#LAK12 Word Cloud of Data Privacy & Ethics Chat


We had a lively online chat during Erik Duval's presentation on privacy and ethics in learning analytics.
Here are some details that you can see on the word cloud:
--"brother" as in Big...some of us brought up the "Big Brother" theme when the question was asked, "What do you worry about with sharing your data?"
--Lies vs. truth: is it considered lying when you present yourself as other-than-you? Is it your responsibility to present the truth about yourself online?
--transparency is a concern: who can see the data? who can see the models?
--power: who has it in learning analytics? who doesn't? Do teachers, learners, or administrators have power in the system?
--We discussed Google and Target as two corporations that are mining our data for marketing insights.
--medical: How are learning analytics issues similar to issues encountered with personally identifiable medical information?

Monday, February 20, 2012

#LAK12 Do Computer Scientists Do Science?


Dragan Gasevic's presentation about the evidence-based semantic web shows that the software engineering field is beginning to adopt the paradigm of evidence-based practice (EBP), which has already been increasingly adopted in medicine, nursing, education, social work, and other human services fields.

In the evidence-based paradigm (and it indeed is a paradigm shift, especially for the medical field that birthed it), the randomized controlled trial is considered to be the study type offering the strongest possible evidence to support a hypothesis. (Note, however, that not every research question lends itself to an RCT. In software engineering, there may be other study designs which are more appropriate.) Systematic reviews and meta-analyses of many randomized controlled trials represent even stronger evidence. As its name suggests, a systematic review requires a systematic and methodical search of the literature in order to present an overall synthesis of results from the highest-quality studies that can be located. A meta-analysis goes several steps further in that it takes the results of several related studies (all of them RCTs, or cohort studies, or case-control studies, etc.) and pools the data, with the effect of creating one large study which can then be analyzed. Both systematic reviews and meta-analyses are important contributions to a field. Although we might not think of them as empirical scientific studies in themselves, they synthesize the entire body of empirical work that has been done on a topic to date. This synthesis is more than the sum of the results of the component studies.
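
To make the pooling idea concrete for myself, here is a toy illustration, with invented numbers, of the simplest fixed-effect (inverse-variance) way that effect estimates from several studies can be combined; real meta-analyses involve far more than this:

    # Combine each study's effect estimate, weighting by the inverse of its
    # variance, so larger and more precise studies count for more.
    effects   = [0.30, 0.45, 0.10]   # hypothetical effect sizes from 3 studies
    variances = [0.04, 0.09, 0.02]   # their (hypothetical) sampling variances

    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_var = 1.0 / sum(weights)

    print(f"pooled effect = {pooled:.3f}, variance = {pooled_var:.4f}")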

The state of software engineering, as Gasevic and others seem to point out, is that most so-called evidence in the field consists of case studies or even simply expert opinion -- both at the very bottom of the EBP evidence hierarchy. The higher-level empirical studies that have been performed often have a small n, which decreases their power to detect a statistically significant intervention effect. This is similar to the situation in other fields that have begun to adopt EBP. Gasevic shows that the study design methods, sampling methods, and data collection methods of published papers in software engineering are lacking in quality. If science is the application of rigorous methods to hypothesis testing, then is this a situation wherein computer scientists and engineers aren't practicing much science at all?

Sunday, February 19, 2012

#LAK12 Semantic Web: Glimpses of Understanding

What I understand about the semantic web is that it:
-- relies on metadata that codes the metaphysical identity of a piece of data and how it relates to other things or concepts. Is it an image? a person? an idea? An author? This reminds me of bibliographic cataloging in libraries. ("The Semantic Web: An Introduction")
--exists on a very small scale currently (Tim Berners-Lee's TED talk).
--would allow us to find and visualize relationships between any two bodies of information, whether the information is a person or an image or a body of raw data.
--would allow the implementation of learning analytics on a much more far-reaching scale.
This is what I understand, and it isn't much, but what I don't understand is a lot. I really did not understand the "Semantic Web: An Introduction" paper, nor the specifics of Hilary Mason's talk, but I found it fascinating nonetheless, and was particularly pleased that she used a disease-related example to illustrate Bayesian statistics. I made a connection from that to the concept of evidence-based medicine, which I've also been learning about in the past couple of years.
Still need to watch Dragan Gasevic's presentation from last week. Perhaps it will make all things clear.

Tuesday, February 14, 2012

#LAK12 Knewton Love

Thoughts from my viewing of the Knewton video:

Flashback to 1992: I'm teaching at a private high school in Memphis, Tennessee. It's my first year of teaching at this school AND my first year of teaching high school. With 5 sections of students in two subjects, finding time to simply deal with the paperwork, classroom discipline, and preparing a new lesson plan for each day is a challenge -- let alone individualizing instruction for my students. A parent contacts me about her son. Without being overly accusatory, she tells me that one of the reasons she placed him in this school is that she hoped he would get some specialized instruction, but that's not happening. He's a gifted student, and now he is bored. I feel frustrated because just devising a single lesson plan to reach the average student is challenging enough; there's no way (I felt at the time) I can meet his needs too.

Knewton would have helped. It represents a way to use LA to do what all good teachers should be doing, but many don't have enough time to do:

--group students by learning preference/style, rather than by ability. This allows faster learners to learn more by teaching their peers.

--identify "study buddies" for students based on specific concepts plus learning preferences. Again, this allows students who have a mastery of a concept to learn more by sharing their mastery, plus students who still need to learn a concept are surrounded by more potential teachers. This would allow teachers to implement peer teaching in their classrooms.

I did notice at least one red flag: towards the end of the video, the narrator mentions that one of the benefits of Knewton for publishers is the establishment of a "lifetime relationship with students," allowing them to develop rich data on a student that could not be "shared, mined, or pirated." There's the catch! And it raises the question: who really owns this data? The student or the system designer?

#LAK12 Community colleges: the perfect candidates for LA

As illustrated by Vernon Smith's presentation on Rio Salado's implementation of LA, community colleges would serve as the perfect testbed/incubator/adopter of learning analytics. Here's why:

1) Community colleges are bursting at the seams. Taken as a group, they represent a very large population -- a large "n" for doing analytics.

2) Community college faculty, for the most part, are there to teach, not to do research. There is a pragmatic focus on student success. Therefore, there would probably be more buy-in at the faculty level for LA.

3) You have a larger "n" of "at-risk" students. Percentage-wise, the "at-risk" students represent a larger chunk of the community college population, since the barriers to entry (such as price and high school performance) are lower than those at a four-year college/university. Therefore, you would probably get more return on investment with implementation of LA due to increased retention of these students.

4) In terms of human development/capacity, the ROI would probably be much greater at the community college level too, as many CC students are first-generation college students. Retaining a greater number of these students would have huge positive implications for society at large. (especially if education helps them see the need for changing the system status quo)

In contrast, I work at a medical school where we have a huge barrier to entry and a population of only about 145 students per cohort. This is a small "n" and I don't believe retention is an issue at all. "Completion" is not our concern. Improved learning is our concern, but in terms of return on investment for American society, I think money spent on LA at the community college level would yield a greater payoff. (unless, that is, the federal government and medical schools start aggressively pursuing a different student population in hopes of solving the huge problem of lack of access to health care of rural and urban Americans...)

Tuesday, February 7, 2012

#LAK12 UMBC MyActivity vs. Purdue Signals: BEATDOWN

How do the UMBC and Purdue systems compare? I watched the video demo of University of Maryland Baltimore County's CheckMyActivity analytics tool and compared it to John Campbell's presentation on Purdue's Signals, as well as Kimberly Arnold's EDUCAUSE piece on Signals.
Intuitiveness: SIGNALS. The use of color instantly symbolizes student grade status. The traffic light symbol is easy to understand, sends a powerful message, and is really the genius of this system. The UMBC system, on the other hand, leverages the functionality of the course management system but, in so doing, also retains the somewhat IT-centric wording ("hits," "sessions," "generate report," and "get gradebook items") and the non-intuitive structure of the reports. It was difficult for me to quickly ascertain which columns on the CheckMyActivity reports were the most important to focus on for student performance (average hits per user? sessions?); so I'm sure that students probably have the same difficulty. Plus, you still must interpret what the numbers mean. You don't have to do that in Signals.

Visual Appeal: SIGNALS. The red/green/yellow colors are attractive, and the interface looked more streamlined than UMBC's system. Plus, they have an attractive logo, and that's always a plus.

In-Your-Face Quotient: SIGNALS, by a landslide. Students have to take action to access the UMBC tool; with Signals, no additional action is necessary beyond logging in to the CMS, which they will do anyway. This is a limitation of UMBC's system. In fact, 16% of students in one course who responded to a survey said that they had never used UMBC CheckMyActivity at all. The fact that the red, yellow, and green lights are staring students in the face every time they log on to the CMS makes Signals impossible to ignore, and students are therefore probably more likely to follow up on the instructor's remediation suggestions.



Model-Building Quality: UNKNOWN. I would really like to see what data models lie behind each system, but I haven't found that information yet. UPDATE 2/9/2012: I suppose CheckMyActivity does not actually have any models upon which it is built. The students are only seeing the raw data of their log-ins to the course compared to the log-ins of people with grades of A, B, C, etc. Therefore, UMBC's CheckMyActivity tool would be an example of the "transaction only" level of sophistication rather than Purdue Signals' "predictive modeling" level of sophistication.

#LAK12 What teaching/learning and technical skillsets does LA require?

In today's LAK12 MOOC presentation, John Campbell of Purdue's Signals Project stated that there are two important skill domains in learning analytics: technical AND teaching/learning. In fact, both must be present in order to effectively design learning analytics tools. He stated that both of these skillsets are very difficult to find in one person, and at Purdue, they have had the most luck in looking for people with the teaching & learning skillset and THEN training them for the technical side.

So here's the question: For success in learning analytics, what teaching & learning skills would you look for? What technical skills?

#LAK12 Recap of "Purdue Signals" presentation by John Campbell

Great presentation by John Campbell of Purdue's Signals Project! If you didn't participate live, you'll want to watch the recording. Highlights for me were:

1) Hearing from a creator about the nuts and bolts of design, implementation, and ongoing maintenance and improvement of an analytics tool that is currently used by students in 80 introductory courses, with the goal of use by 20,000 unique student users.

2) Students have lodged no complaints about privacy issues and are positive about its use; participating faculty are so convinced that it works that they have begun refusing to participate in controlled trials of Signals.

3) Privacy considerations include: faculty can only view data for their own students. IRB approval was obtained for both the testing phase (an easy IRB process) and the implementation phase (a more time-consuming IRB process). They don't let students "opt out..." I guess the principle is that...

4) Customization of messages to students, and the platform on which they're delivered, is important...email doesn't work; slight customization of messages does work for grabbing student attention. (I've found that Facebook works for getting instantaneous responses from students, especially during class time when they are ALL monitoring it!!)

5) Excellent "practical suggestions" which he ended with...such as, "think about the theoretical basis of your analytics project." The Purdue Signals project is based on Tinto's input/environment/output model (actually I think it's Astin's model as the link shows...)

6) AND last but not least...that it is very difficult to find people with the teaching/learning AND technical skillsets that are required. They are finding that in-house training works for the technical part.

#LAK12 Is LA strictly behaviorist?

I thank Bianka Hadju for calling my attention to this statement about behaviorism in Siemens' and Long's Penetrating the Fog piece:

Since we risk a return to behaviorism as a learning theory if we confine analytics to behavioral data, how can we account for more than behavioral data?

Behaviorism is a psychological and educational theory which, as alluded to in the statement, is no longer in favor in most educational circles. The primary criticism of behaviorism, in plain terms, is that this educational theory would simply explain differences between learners by observing and measuring their outward behaviors. Behaviorism does not account at all for inner mental states, thoughts, feelings, cognition, meanings ascribed to events by learners, etc. If there is no difference in behavior, then there is no difference between the learners. A fascinating explanation of the theory and its criticisms is available at the online Stanford Encyclopedia of Philosophy.

This ties back to Ryan S.J.d. Baker's presentation last week about model building in educational data mining. He described an online reading tutor system which was tested with U.S. students and students in the Philippines. Baker used the system to build and test his analytic model of "gaming the system." He could tell who was gaming the system by observing behaviors such as click patterns, order of clicks, wait time, etc. What was even more interesting to me, though, was his finding that although both groups of students gamed the system, the meanings behind this observable action -- their feelings about the activity, their attitudes toward the use of the system -- were vastly different in each cultural group. The students' feelings and attitudes, which are hugely important to the educational process, cannot be measured by clicks at all. The two groups displayed the same behavior, but their reasons for doing so were very different.

Will we focus on measurable behaviors without considering the other important changes we wish to inculcate as educators, such as attitudinal and affective changes?

Looking forward to hearing about Purdue's "Signals" project today!

Tuesday, January 31, 2012

#LAK12 Model-building in EDM

Ryan Baker's presentation on educational data mining today as part of the Learning Analytics and Knowledge MOOC helped me to better understand the model development concept in EDM. One of his slides listed several types of learner behaviors for which he developed models. When learners interacted with educational software, he was able to describe certain learner behavioral patterns (number of clicks, wait time between clicks, order of clicks, etc.) in such a way that if the data log files were analyzed for any learner, those behaviors could be spotted. He could identify when a learner was exhibiting behaviors such as:


  • carelessness

  • off-task activity

  • gaming the system

  • avoiding help when they needed it

  • not asking for help because they didn't need it

  • guessing

I'd be very interested in hearing more about the model development step, and how the researchers constructed meaning from a pattern of clicks.
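
While I wait to learn more, here is a purely hypothetical sketch -- not Baker's actual detectors -- of the first step I imagine: turning a raw click log into simple features (action counts, pauses, help requests) that a "gaming the system" classifier could be trained on, with labels supplied by human observers:

    from datetime import datetime

    log = [  # invented log entries: (timestamp, student, action)
        ("2012-01-31 09:00:01", "s1", "attempt"),
        ("2012-01-31 09:00:03", "s1", "attempt"),
        ("2012-01-31 09:00:04", "s1", "hint"),
        ("2012-01-31 09:00:05", "s1", "hint"),
        ("2012-01-31 09:00:06", "s1", "attempt"),
    ]

    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t, _, _ in log]
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

    features = {
        "n_actions": len(log),
        "n_hints": sum(1 for _, _, a in log if a == "hint"),
        "mean_gap_seconds": sum(gaps) / len(gaps),
        "fast_actions": sum(1 for g in gaps if g < 2),  # rapid-fire clicking
    }
    print(features)  # these features would feed a classifier trained on human-coded labels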


There are two observations that I took away from the presentation.


1) Model-building is inherently value-laden. Baker alluded to this a bit when he mentioned that learner behaviors observed in one of his modeling tests were vastly different in the Philippines as compared to the United States. Learner behaviors are conditioned by culture and situated in a culture. As educators, we too are products of our culture and cannot avoid building our assumptions into every tool and system that we develop. Since these data models are used to classify learners, model builders need to approach their task with the utmost humility and care.


2) The educational data mining approach seems to be an example of behaviorism. There was much discussion about the observable behaviors of the learners but not much about the learners' internal thoughts and feelings. Baker did allude a bit to the thoughts and feelings of U.S. and Philippine students when they interacted with his Scooter software. For me, that was the most interesting part of the presentation.

Monday, January 30, 2012

#LAK12 No Shadow Databases!









Respondents to the ECAR study by Goldstein said that they have not had a lot of success in reducing the presence of shadow databases. Shadow databases are "silo" information systems maintained by separate functional units in the organization (or even within the same unit) which contain the same types of information, or perform similar functions. The problem is, usually they do NOT contain the same information. Shadow databases proliferate when there is weak oversight of the institution at the highest levels of administration, when little communication exists between units, and when individuals are rewarded not for sharing information but for hoarding it. A perfect description of many institutions in the world of higher education.




The problem with shadow databases is that they represent wasted time, effort, and money. Not just by wasting the opportunity to learn from others' information, but sometimes even in the money put toward purchasing redundant systems. But they do serve an important sociological function in the workplace: preserving the autonomy of a unit within a large organizational structure.




Institutions with "performance dashboards" as part of their analytics technology report the best outcomes for eliminating shadow databases. No doubt this is because the dashboard tools are very attractive analysis tools for almost every organizational unit, and this makes units want to opt in. Just as likely, however, is that these systems suddenly make it glaringly obvious whose information is missing, thereby providing some pressure from peers and higher-ups to participate.


Eliminating shadow databases is indeed a business improvement that the faint-hearted should not attempt.

#LAK12 examples of 5 levels of sophistication




Goldstein's Academic Analytics: The Uses of Management Information and Technology in Higher Education (2005) mentions "five levels of sophistication of use" of academic analytics.





1) transaction only: e.g., "How many students logged in to Course Y in the last 24 hours?" You would run a query on the learning management system, but it would only happen on command.

2) "analysis and monitoring of operational performance": An illustration of might be: "What is the trend in Mediasite (lecture capture) usage throughout the semester?" This specific question is especially appropriate if your resource is limited to a certain number of concurrent users, for example. You can do this analysis with the previous level if you download into Excel, create graphs, etc. but it is easier if your data tool provides an automatic visualization.

3) "What-if decision support, such as scenario building": I can't even think of an example of this. Please comment and give me an example if you know of one. I'll give the next one a try...

4) "Predictive modeling and simulation": An illustration of this might be "How should classroom scheduling practices have to change 4 years from now if we know that we have X students in the Class of 2015 with certain types of characteristics?"


5) automatically triggering a business process. This would be akin to a Walmart-like situation wherein, when a customer buys a stock item with a certain SKU, stock levels of that SKU are checked against the rate at which that SKU has been purchased that week and the average time to reship from the vendor; then, depending on those results, an automatic order is sent to the vendor. I think this is what LA aspires to. For example, a student doesn't log in to the CMS for a certain number of days, and, depending on their demographics or the time of the semester, student services or the course instructor sends out an exploratory "Hey, how's it going? How can we help?" email, which is most likely automated as well. (A toy sketch of such a trigger rule follows.)
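
Here is a toy sketch of that kind of automated trigger rule; it is my own invention for illustration, not any vendor's product, and the seven-day threshold is an assumed policy:

    from datetime import date, timedelta

    INACTIVITY_THRESHOLD = timedelta(days=7)  # assumed policy, purely illustrative

    last_login = {  # hypothetical data pulled from the CMS
        "student_a": date(2012, 1, 28),
        "student_b": date(2012, 1, 15),
    }

    def students_to_contact(logins, today):
        return [s for s, d in logins.items() if today - d > INACTIVITY_THRESHOLD]

    for student in students_to_contact(last_login, date(2012, 1, 30)):
        # In a real system this would hand off to student services or the instructor.
        print(f"Queue 'How's it going?' email for {student}")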

Note that illustration #4 above tends to be more of an "academic analytics" use, which I interpret as more institution-based, i.e., how can we run this institution more efficiently? Illustrations #1 and #5 are more "learning analytics" focused, which I interpret as learner-based, i.e., how can we help an individual or class of individuals succeed academically?


The article further mentions that central research administration is one of the organizational units least likely to use analytics, which is rather telling. Here's a possible use: strengths of association between academic departments or individual research faculty are analyzed through bibliographic citation data, or through the subject tags on their publications (like MeSH headings, for those of you who use PubMed). Individuals who are found to be highly associated in terms of the domains in which they work could be potential collaborators, joint/bulk purchasers, etc. This would be especially valuable for larger institutions that are spread out geographically, and for central research administration offices, which sometimes are not aware of these types of links. Cost savings would be realized, and the "team science" concept could be implemented. I'm sure this is one of the purposes of systems such as Harvard Profiles and VIVO, which allow visualization of connections between researchers and departments. (A rough sketch of the tag-overlap idea follows.)
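
Here is a rough, hypothetical sketch of the tag-overlap idea: score how strongly two researchers' work overlaps by comparing the subject tags (MeSH-style headings) attached to their publications. The names and tags are invented, and real systems like VIVO surely do far more than this:

    publications = {
        "Dr. Alpha": {"Diabetes Mellitus", "Telemedicine", "Rural Health"},
        "Dr. Beta":  {"Telemedicine", "Rural Health", "Primary Care"},
        "Dr. Gamma": {"Oncology", "Genomics"},
    }

    def overlap(tags_a, tags_b):
        """Jaccard similarity: shared tags / all tags used by either researcher."""
        return len(tags_a & tags_b) / len(tags_a | tags_b)

    names = sorted(publications)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = overlap(publications[a], publications[b])
            if score > 0:  # nonzero overlap suggests a potential collaboration
                print(f"{a} <-> {b}: {score:.2f}")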

#LAK12 Google vs. your English 101 instructor

LA is fundamentally different, in ethical terms, from the analysis done by online vendors because of power differentials. The relationship between you and your English 101 instructor, or your institution of higher education as a whole, is much different from your relationship with Google or Facebook or Amazon.

When online, we often operate under the assumption that our activities are occurring in our bedrooms, so to speak. We don’t always consciously operate as if our online activities are actually occurring out in the scrutinizable open – although we should be aware of that, if we read the fine print in the terms of agreement. Yet, if they are aware, most people are OK with knowing that Google/Amazon/Facebook is mining their every tweet. Most of us don't know anybody working for those corporations, and even if we did, we know that we are one of multimillions...our online actions are just a drop in the big data bucket.

However, the dynamics between a student and the institution of higher education in which he/she is enrolled are quite different. In this case, the people analysing their data trail may actually KNOW them, and may have the power to award grades, offer or withhold a job reference, or dole out scholarships, work-study jobs, or internship leads. Now our data is subject to the institutional gaze, and those eyes do have the power to reward and punish. It is a power differential tilted markedly toward the institution. Learning analytics may help a student persist along the path towards college completion, but I believe that the analysands' feelings about being scrutinized by individuals in positions of power (including their instructors) at a university would be much different from their feelings about Google or Facebook mining even that same pot of data.

Saturday, January 28, 2012

Learning Analytics getting increasing library press...LAK12

I've been Googling "learn* analyt*" and "librar*" from time to time for about a year now. A year ago, the only things that I could find were reports, not written by librarians, that mentioned academic library data as possible fodder for learning analytics and, of course, instances wherein "library" was used in a mere data sort of way, i.e. "data library." That's starting to change. Today I found:

The Library Impact Data Project, a JISC-funded project in the UK in which systems librarians at the Univ. of Huddersfield seek to describe the relationship between library usage and student attainment; identify courses that require a high level of library usage and describe the practices that lead to this; and identify courses that do NOT require library usage and improve services to those courses. I'm definitely going to look into this one more: http://library.hud.ac.uk/blogs/projects/lidp/

Stephen Abram's blog post from Jan. 3, 2012 that points to EDUCAUSE's "7 things about analytics" article, but without any analysis of it: http://stephenslighthouse.com/2012/01/03/learning-analytics/comment-page-1/#comment-19972

In 2010, Kumar, S., Edwards, M., & Ochoa, M. published "Analysis of Online Students' Use of Embedded Library Instruction in a Graduate Educational Technology Course," which used data logs from the LMS in which library instruction was embedded to examine student usage of library resources. This goes far beyond the usual self-reported "preferences/how did you like it?" type of research that we often see in library-related publications.

The Sept. 2011 issue of College and Research Libraries had an article by Wong and Cmor describing a study that linked GPA to library workshop attendance with n=8,000+ students. The librarians had access to the school's ERP to do this; interestingly, the study was done at a Hong Kong university, and there is little mention of any consent from the students involved (although student names were unlinked from GPA), and also no mention of an IRB-type approval process in the way we would think of it in North America. The article is at http://crl.acrl.org/content/72/5/464.full.pdf+html I imagine the only similarity between this study and true LA is the link between the library and the ERP -- a link which was enacted outside the ERP, at any rate.

The Georgia Perimeter College Libraries' GPC blog by Rebecca Rose explains:
A library application for learning analytics could be a report alerting the library to order more materials in specific subject areas, based on enrollments in specific courses. Another idea would be to get lists of students who have never experienced library instruction. http://gpclibraries.wordpress.com/2011/04/21/technology-trends-in-libraries-pt-3/

And, finally, lots of library-related blog posts reference the 2011 Horizon Report, which listed LA as a technology with a 4-5 year time horizon. So far, these have mostly been informational posts, and they do not offer any visions or analysis for LA applied to libraries.

So, overall, it appears that tech-focused librarians are becoming aware of LA, but outside of a couple of projects, libraries are not involved with LA very much at all. If I'm wrong, please leave a comment and give me a clue.

Thursday, January 26, 2012

LAK12 Learning Analytics: Some Questions from the Drive In

Driving to work this morning in the light rain I brainstormed about some possible questions about learning analytics:

What are the intellectual freedom and privacy implications of LA?

Can/should LA data be used in evaluations of teachers?

What are the philosophical assumptions/bases of LA? Is LA based on a behaviorist, humanist, progressive, etc. educational philosophy?

Who owns the data generated by students? (I'm pretty sure I have seen this addressed in some of the literature, perhaps not the LA literature, but elsewhere.)

Is this considered human subjects research and what are the informed consent implications of that? Can students opt out?

How will archiving of the data be managed? What can the data be used for in the future?

What are the questions that can be answered with the data? Which of the answers are actionable? E.g., "Data shows that students who visit the cafeteria 3 times by the 12th hour of the new semester in their first year have a higher chance of graduating in four years." Is this a cause or an effect? Can we act on this information?

During this MOOC, I want to see learning analytics technology in action and learn a lot more about the data side. But my primary interest is in the ethical questions.

Looking forward to tackling the reading list.

Wednesday, January 25, 2012

Analytics and the Academic Library: Peaceful Coexistence? LAK12

There is one thing you must know about librarians: we prize intellectual freedom and the privacy of the individual user. For example, the American Library Association's "Intellectual Freedom Principles for Academic Libraries" states:
"The privacy of library users is and must be inviolable. Policies should be in place that maintain confidentiality of library borrowing records and of other information relating to personal use of library information and services."

In many US states, there are laws stating that library usage records can only be divulged with user consent or under subpoena or court order. Librarians across the country have resisted the US Patriot Act, which infringes on the civil liberties of library users to access information free from the "watchful eye" of the government.

Given these values, imagine my surprise to read statements such as this one, made by George Siemens and Phil Long in their piece in EDUCAUSE Review, "Penetrating the Fog: Analytics in Learning and Education:" "Similarly, most analytics models do not capture or utilize physical-world data, such as library use, access to learning support, or academic advising." [Here, they weren't stating that it shouldn't be done, but that libraries presented a data stream that just has not been captured yet.]

Don't get me wrong...I am deeply interested in learning analytics from the viewpoint of an educator and learner and I think that it holds promise in reaching goals of degree completion and retention (if these are indeed the right goals). But can academic library usage data be used in learning analytics in an ethical manner that remains true to ideals deeply held by my profession? Or will the ideals of intellectual freedom and privacy be overcome by the college completion agenda? These questions I look forward to exploring in LAK12.

Tuesday, January 24, 2012

LAK12: A Poem to Begin

I can't help but recall a piece by one of my favorite poets, which keeps me grounded amid the technology and information onslaught. T.S. Eliot published this in 1934: "Where is the knowledge we have lost in information? Where is the wisdom we have lost in knowledge?"

Opening Stanza from Choruses from "The Rock"

The Eagle soars in the summit of Heaven,
The Hunter with his dogs pursues his circuit.

O perpetual revolution of configured stars,

O perpetual recurrence of determined seasons,

O world of spring and autumn, birth and dying

The endless cycle of idea and action,
Endless invention, endless experiment,
Brings knowledge of motion, but not of stillness;
Knowledge of speech, but not of silence;
Knowledge of words, and ignorance of the Word.
All our knowledge brings us nearer to our ignorance,
All our ignorance brings us nearer to death,
But nearness to death no nearer to GOD.
Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?
The cycles of Heaven in twenty centuries
Bring us farther from GOD and nearer to the Dust.

T. S. Eliot (1888-1965),
The Rock (1934)