How to Sync Files From My Laptop to a Remote Server

Context

I received a phone call from a person who said in a frightened voice: “Dermatologist!” I wanted to say kindly that he had probably got the wrong number, because I am not a dermatologist and I cannot give any medical advice about skin issues except to suggest visiting a dermatologist. But the person said: “I am a dermatologist. I have a very big problem. I have to transfer a huge number of files from one computer to another. I heard that you can help me. I have more than fifteen thousand images and PDF files, and I have to transfer them to some server. I have been told that I will receive SSH details. What is that?”

We met, and in our conversation I realized that a huge repository of medical images, scientific texts and program files had to be migrated to a new server which would be used by scientists to share scientific research. The dermatologist had a laptop with GNU/Linux, and he had no previous experience with the procedures and software needed to accomplish his task.

“It is doable,” I said. “But you have to be precise in typing commands and respect the order of the steps. Focus, concentration and accuracy are the keywords,” I emphasized. “I hope I will be able to do that,” he said.

First step – SSH

SSH is a secure communication protocol used to connect to a remote GNU/Linux server. SSH stands for Secure SHell. Many people who are not used to working with servers expect that they will see some complicated graphical interface. There is no complicated graphical interface; there is a textual interface without buttons, dropdown menus, radio buttons or other boxes. Is that even more complicated? No. But if you are a dermatologist or another professional not familiar with IT, you can ask someone to help you, or just be happy to learn a very small number of commands. Just follow and remember this guide, and you do not need to learn anything complicated in order to copy a large number of files from your computer to a server that someone gave you access to.

Our dermatologist has those files copied to the laptop, which he will use to transfer them to the other server. Since he was told that the remote server was set up with SSH enabled, he has to learn how to connect to it.
In order to connect to the remote server he turned on his laptop and started the terminal application, the text interface he needs in order to start an SSH session.

The terminal application window usually looks like this:

[Image: terminal application window]

The terminal application window will show a prompt with your username and computer name. No buttons, no complicated menus. Only a prompt for commands. (Users of screen readers can easily perform all the required activities in a textual interface.) However, the most important thing is that you understand the logic of what you want to do and can express it as a command. Once you have the right commands, you can literally copy them to the prompt.

Why Secure Shell? Why not some nice graphical user interface with my password? Security is not only a feature. It must be a principle. So, how does SSH work? Using passwords alone can be a vulnerable method, since many malicious bots and people with powerful hardware can use various techniques to steal passwords from you. Where security is concerned, SSH is still the better method. SSH uses so-called SSH keys, which are a cryptographically matching pair of keys. One is the private key and one is the public key. The public key can be shared and exposed to others (do not do that needlessly anyway), but the private key must not be exposed to anyone. The private key and the processes of encryption and decryption during the establishment of the connection are essential for the security of an SSH connection.

First, we have to generate a pair of keys using the command:

ssh-keygen

When we issue that command at the prompt, the system will ask us to name the file in which the key will be saved, and after that we will be asked to enter a passphrase. Please choose a passphrase that you can remember. Your screen will look similar to the image below:

[Image: output of the ssh-keygen command]

In recent versions of GNU/Linux distributions, ssh-keygen is usually configured by default to generate keys with a high level of security. In our case the default values gave us RSA keys with 2048 bits.
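If you would rather choose the key type and size yourself than rely on the defaults, ssh-keygen accepts them on the command line. The following is only a sketch: it writes a throwaway key into a temporary folder so that nothing in your real ~/.ssh folder is touched, and it uses an empty passphrase (-N "") purely so that the example runs without asking questions. The file name id_demo and the comment text are made up for illustration; for a real key, leave out -N and type a passphrase when asked.

```shell
# Create a scratch folder so the demo never overwrites a real key.
key_dir="$(mktemp -d)"
key_file="$key_dir/id_demo"        # hypothetical file name

if command -v ssh-keygen >/dev/null 2>&1; then
    # -t rsa -b 4096 : ask explicitly for a 4096-bit RSA key
    # -C             : free-text comment stored in the public key
    # -N ""          : empty passphrase, for this demo only
    # -f             : where to save the key pair
    ssh-keygen -q -t rsa -b 4096 -C "demo key" -N "" -f "$key_file"

    # Print the size, fingerprint and type of the key just created.
    ssh-keygen -l -f "$key_file.pub"
else
    echo "ssh-keygen is not installed on this computer"
fi
```

The last command is also handy for an existing key: pointing it at your own public key file shows how many bits it has and which type it is.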

After we have generated the keys, we should transfer the public key to the remote server by issuing the command:

ssh-copy-id username@remote_host

Usually, the administrator of that server or the hosting company that set it up gives you the username and password, while remote_host can be a name such as medicaljournal.org or an IP address, which looks like 193.243.27.183 or so. (This IP address is unknown to me, so please do not use it; I typed it just to show people without an IT background what it can look like.)

After that you can connect to your server by issuing the command:

ssh -p 2020 username@remote_host

Please note that we use the -p option because SSH on this server accepts connections on port 2020; sometimes hosting companies use such a port instead of the default 22. If your hosting company uses port 22, you do not need to write “-p 2020”. If they use another port, write “-p numberofport”. The same -p option also applies to the ssh-copy-id command above when the server does not use port 22. After issuing the command we will see at the prompt something like:

The authenticity of host '[name or ip number of your host will be here]:numberofport ([name or number of your host]:numberofport)' can't be established.
ECDSA key fingerprint is SHA256:y9aVJtMpIZusjf3bmSEtWg/9RwjTrCbAT0Tli9pvLmM.
Are you sure you want to continue connecting (yes/no)?

When you type “yes” and press Enter, it will ask you for the password of your user on that server (or for the passphrase of your key, if the public key was copied successfully). If you are scared, you can type “logout” and press Enter, and the system will log you out. So far so good. Nothing exploded, you are safe, and you have done wonderful work which you have to do only once.
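Typing the port and user name every time is easy to get wrong. The OpenSSH client can read such details from a configuration file, normally ~/.ssh/config. The sketch below writes an example entry into a temporary file instead, so that it is safe to try; the alias journal-server is invented for this example, and username and remote_host are the same placeholders as in the commands above.

```shell
# For a safe demo the entry goes into a scratch file;
# in real use the same lines would live in ~/.ssh/config.
config_file="$(mktemp)"

cat > "$config_file" <<'EOF'
# "journal-server" is a made-up alias; pick any name you like.
Host journal-server
    HostName remote_host
    User username
    Port 2020
EOF

# Show what was written.
cat "$config_file"

# With these lines in ~/.ssh/config, the long command
#   ssh -p 2020 username@remote_host
# can be shortened to
#   ssh journal-server
```

Other tools that connect through ssh read the same file, so the alias saves typing there as well.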

Finally copying. But, I have questions!

When I showed that to the dermatologist, he felt something between happiness at a new discovery and a sort of stage fright. “Will I be able to do that properly to the end?” was visible on his face. “But I have some questions,” he said. “I have been told that I have to do that with port 2020. Secondly, I will always have a number of new files on my computer; how can I copy them to the server? One by one? Should I keep some paper record of what I copied?”

Well, I started, there are easy answers to your questions. We can combine the commands rsync and ssh if you have many files. First, you have to keep your files in one folder on your computer, and then we issue this command:

rsync -avz -e "ssh -p $portNumber" source destination

The first time, the system will copy all files from the folders on your computer to the remote computer. On later runs it will copy only new or changed files. (rsync stands for remote sync.)

In the case of our dermatologist, he has a folder research-files with a lot of subfolders named after researchers, and each of those has the subfolders articles, statistics, measurements, photos and diagrams. He wants the same structure of folders and subfolders on the remote server. On the remote server, the administrator created the user researcher with the home folder /home/researcher. All files should be copied to that folder using port 2020 for ssh. It sounds complicated, but it is not. On his laptop his user is dermatology, and all folders and files are stored in /home/dermatology/research-files.

The command will be like this:

rsync -avz -e "ssh -p 2020" /home/dermatology/research-files researcher@remote_host:/home/researcher/

Instead of remote_host, type your host name or IP address. After you issue this command, the system will ask you for the password of your user (or the passphrase of your SSH key). After you type it, the remote sync will begin. Thanks to the flag v in “avz”, which stands for “verbose”, you will see the whole process on your screen. The duration of the process depends on your upload speed. Meanwhile you can send mail, open documents, or play music on your computer. The first run can take longer if you have many files. But the second and later runs will transfer only the difference, which means files that were added or changed after the first run. That will probably be considerably shorter. That’s all. Not too hard.
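The dermatologist also asked whether he should keep some record of what was copied. Two rsync options can help with that: -n (--dry-run) only lists what would be transferred, and --log-file writes a record of each run to a file. The sketch below tries both on two temporary local folders, so no server or password is needed; the folder and file names are invented for the demonstration. On a real server the same options can be added to the rsync command from this guide.

```shell
# Build a small local playground: one "laptop" folder, one "server" folder.
work="$(mktemp -d)"
mkdir -p "$work/research-files/photos" "$work/server"
echo "first image" > "$work/research-files/photos/a.txt"

if command -v rsync >/dev/null 2>&1; then
    # First run: everything is copied, and the run is recorded in sync.log.
    rsync -av --log-file="$work/sync.log" "$work/research-files" "$work/server/"

    # A new file appears on the laptop.
    echo "second image" > "$work/research-files/photos/b.txt"

    # Dry run (-n): lists what would be sent, copies nothing.
    rsync -avn "$work/research-files" "$work/server/"

    # Second real run: only the new file is transferred.
    rsync -av --log-file="$work/sync.log" "$work/research-files" "$work/server/"
else
    echo "rsync is not installed on this computer"
fi

# The same dry run against the remote server (not executed here):
#   rsync -avzn -e "ssh -p 2020" /home/dermatology/research-files researcher@remote_host:/home/researcher/
```

Because the source path has no trailing slash, rsync creates the research-files folder itself inside the destination, which keeps the folder structure the same on both sides.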

Intuition is better than manual?

In more than 25 years of working with software I have passed through various stages of using it. I struggled with bad books and nearly empty help files. I stumbled over parts of the interface, guessing at the logic behind them and wondering why the workflow for some operations and procedures was not obvious. I could not do even simple things, since books were scarce and expensive.

Translators who translated manuals and technical books often invented new terminology and sometimes used awkward sentences in order to translate quite technical text into a language which did not have those words in its vocabulary. Some linguists were horrified by the intrusion of foreign words and often coined their own terms, based on a lack of knowledge about the technology concerned and on misinterpretation of concepts and meaning.

The writing style was often very modest, to say the least. There were no typographic differences between narrative text and command lines, and illustrations were too general and did not represent the process described in the paragraph. Instructions were written in a way that only those who already knew the software could understand. Those who did not know how to use it were mostly lost. Procedures were not described concisely, some steps were missing and, above all, some examples did not work.

On the other side, we have to be honest and admit that at the time, although more than 20 years had passed since operating systems and software development came into wider use, software technology was only beginning to reach the wider user population. The code for the Apollo missions was written by hand. The image below, taken from Wikipedia, shows Margaret Hamilton next to a stack of code she and her team wrote for the Apollo mission computers.

[Image: software developer next to a stack of books with code]

The development of software and its maintenance was a rather hard road with a number of difficulties.

I strongly recommend that those interested in the history of software development read the article “No Silver Bullet – Essence and Accident in Software Engineering”, published by Fred Brooks in 1986. The very interesting Wikipedia article about it states that: “Brooks distinguishes between two different types of complexity: accidental complexity and essential complexity. Accidental complexity relates to problems which engineers create and can fix; for example, the details of writing and optimizing assembly code or the delays caused by batch processing. Essential complexity is caused by the problem to be solved, and nothing can remove it; if users want a program to do 30 different things, then those 30 things are essential and the program must do those 30 different things.” Many scientists, software developers and businessmen have noted that software grows in size and complexity faster than methods to handle such complexity are invented. Some software companies launched the marketing slogan that their software is “intuitive and user friendly”, which proved to be just a marketing slogan, far from the truth.
All of that sometimes created great confusion. I felt like a botanist looking for a magic flower in the middle of the jungle. I felt I was stuck in the middle of a rainforest with gigantic trees and huge bushes, my skin crisscrossed with scratches and covered with blood stains. It felt as if I were trying to fit a square peg into a round hole. The enormous insects around me and the snarling of hungry beasts were frightening. How did I get here? How do I get out of this?

In addition, increasingly complex software was often not well developed and crashed frequently, which caused even more confusion. It was sometimes fun, but sometimes I had embarrassing experiences: hours wasted, data lost, a broken hard disk. I asked myself: What have I done wrong? Is all of that really so sensitive that pressing the wrong key on the keyboard can toast my computer? Who will use this stuff if it is so easy to ruin all I have done?

At the same time, companies followed the old devastating principle that everything should become a commodity, and they invested less in the development of supporting documentation. Instead they established support services that were neither cheap nor efficient. The free software movement gave everyone free access to code protected by the GNU GPL license, which grants everyone the right to develop, modify, distribute and document software as they like. That was a promising framework and a social, legal and technological basis for more responsible development and use of software. At the same time, users of many software packages experienced a lack of support, partly written manuals, a variety of undocumented features, and sometimes unhelpful, arrogant and cynical support staff.

Even when manuals were written properly, they were often very large and nicely illustrated books. Too few people had enough time to read them. I strongly believe that those who did have enough time to study such manuals benefited a lot.

But the use of software manuals does not depend only on the existence of sufficiently illustrated and well-written books. Since I have been involved in the open access movement for more than 10 years, I strongly believe(d) that people from academia would use software manuals in a more systematic and principled way. But my experience is very different.

There may be different reasons for that:

  • insufficient time for reading detailed manuals
  • scientists and various scholars develop their own software for some purpose, so they already know it
  • software is developed for a special purpose and will not be used by a wider audience
  • manuals and support are expensive or nonexistent
  • scholars and scientists are tired of reading and of the exhausting, tedious work in academia, so any additional obligation to read is avoided
  • some scholars and scientists display various psychological traits when confronted with new scientific areas, including software
  • some scholars and scientists have (un)diagnosed dyslexia, dyscalculia or other neurological conditions on a (sub)clinical level, which prevents them from reading manuals in detail, following procedures, and remembering the relationships and hierarchies involved in the proper use of software features
  • some scholars and scientists are, for various reasons, print disabled, and they need assistive technologies to help them learn to use software efficiently

I am sure that there may be other reasons too. But those who write software and support scientists, especially in the field of open access, should observe the following principles:

  • manuals should be written and tailored to the needs of a particular institution/library/journal
  • manuals should present procedures with examples from real use of the software in a given context, matching the version of the software that is in production use by the users
  • manuals should describe each step in a procedure or workflow and present a screenshot of what the user sees on the screen
  • manuals should be written in an accessible format
  • manuals should be enriched with infographics and other illustrations intended to present information in a brief and clear way. They can improve the learning process by using graphics to enhance the user’s ability to understand structures, workflows and procedures
  • manuals should offer links to screencasts that show users how some tasks can be accomplished
  • software features that are invented with the aim of improving the workflow of users should be documented with examples of real scenarios and contexts for a given group of users (e.g. journal, institute, faculty, library)
  • manuals, infographics and screencasts should be licensed under some acceptable Creative Commons license or the GNU Free Documentation License

Well, one can say that it is not easy to do. But software and manuals are a significant part of the social interaction of various types of users (in the context of open access these may be readers, authors, reviewers, editors, librarians, lawyers, students, businessmen, policy makers). They are not a commodity in a box. Manuals and software are part of an ongoing incremental build model, due to the very dynamic development of the standards and services associated with academic publishing. Contemporary software platforms require developers and users to keep pace with the development of the internet and academic publishing, and with various technological, cultural, social, legal and scientific challenges and developments. Consequently, when an editorial board or institution plans to deploy a software platform for academic and scientific publishing, it needs to plan for continuous development, testing, support, and the development and distribution of documentation.

Manuals for Open Journal Systems

I have found that many editorial boards struggle with a lack of concise instruction materials and a lack of people who can train them with a hands-on approach. They usually find some solutions in online forums, but it is time-consuming for editorial boards to search for partial information. Sometimes people who write manuals do not explain each step. Several people contacted me and asked: “What do I have to do now? Something is missing.”

System administrators and software developers assume that what is easy for them should be easy for everyone. They plan training to be done in one evening because “It is easy.” In my experience, such practice often leads to misconfiguration of the application, underuse of its features, and mistakes in performing workflow tasks and procedures. Working with applications such as Open Journal Systems is not hard, but it is complex, and it takes some time until a user is familiar with its functionality and the simple procedures for its configuration and efficient use.

I wrote manuals for authors, editorial boards and reviewers for scientific journals according to their needs.

Here you can find the manuals for authors, editorial boards, and reviewers.

I will soon publish here a manual that puts together some basic administrative and editorial functions, aimed at the successful configuration of the Open Journal Systems application for your journal.

 

CrossRef, DOI, XML, easy to do?

Many people from editorial boards have asked me various questions about registering their journal with CrossRef. What is a DOI? Is that XML thing too complicated? Do we need someone with a PhD to do that?

CrossRef is a not-for-profit membership organization for scholarly publishing working to make content easy to find, cite, link, and assess. It does so in five ways: rallying the community; tagging metadata; running a shared infrastructure; playing with new technology; and making tools and services to improve research communications. The Digital Object Identifier (DOI) is a special identifier assigned uniquely to a publication or one of its parts, such as an article, issue, galley, dataset, book, database etc. There is an interesting Wikipedia article about DOI for those who do not have much time to go into details.

People from CrossRef created a series of training materials, which you can find on their YouTube channel.

I found it very useful to watch their training on content registration and maintaining metadata. You can find a lot of useful information in that video training. If you prefer a slide presentation, CrossRef has published one on the same topic on SlideShare.

I always suggest Open Journal Systems to those who do not want to spend a lot of time on the technical work of metadata deposit and XML formatting.

It is very easy to use Open Journal Systems to assign DOIs. The easy-to-use interface and the largely automated process of metadata deposit save you a lot of time and effort. There is a special plugin for assigning DOIs to your articles or other article/publication components. Users of the 2.4.x branch of OJS can find information on assigning DOIs here.

Users of the 3.x branch of OJS can do that even more easily, with less than 2 minutes of plugin configuration. Huh, you will see that DOI and XML exports are not such a hard thing. After using OJS you may ask yourself why you had so much anxiety about things that are so easy to do.